DocumentCode :
2183605
Title :
HDGSOMr: a high dimensional growing self-organizing map using randomness for efficient Web and text mining
Author :
Amarasiri, Rasika ; Alahakoon, Damminda ; Smith, Kate ; Premaratne, Malin
Author_Institution :
Sch. of Bus. Syst., Monash Univ., Australia
fYear :
2005
fDate :
19-22 Sept. 2005
Firstpage :
215
Lastpage :
221
Abstract :
Mining text data from the Web has become a necessity because of the volume of data available there. While search engines are the popular way to look up information on the Web, feature map techniques are still widely used to analyze the content of large collections of Web pages. One problem with processing large collections of Web text using feature map techniques is the time taken to cluster them. This paper presents HDGSOMr, an algorithm based on a growing variant of the self-organizing map. This novel algorithm incorporates randomness into the self-organizing process to produce higher quality clusters within a few epochs while using smaller neighborhood sizes, resulting in a significant reduction in overall processing time. Details of the HDGSOMr algorithm and results from processing large collections of text data that demonstrate its efficiency are also presented.
Keywords :
Internet; data mining; search engines; self-organising feature maps; text analysis; HDGSOMr; Web mining; Web page; information search; search engine; self-organizing map; text mining; Clustering algorithms; Computer science; Data mining; Information analysis; Organizing; Search engines; Text mining; Web pages; Web sites; World Wide Web;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on
Print_ISBN :
0-7695-2415-X
Type :
conf
DOI :
10.1109/WI.2005.70
Filename :
1517845