DocumentCode :
2671502
Title :
WTCA: A Web text clustering algorithm based on DFSSM
Author :
Yu, Zheng ; Rong, Qian
Author_Institution :
Coll. of Sci., Northeast Forestry Univ., Harbin
fYear :
2008
fDate :
16-18 July 2008
Firstpage :
811
Lastpage :
816
Abstract :
A key challenge in data mining is mining richly structured datasets such as Web pages. In this paper, we propose a Web text clustering algorithm (WTCA) based on the discovery feature subspace model (DFSSM), which is our original work. The algorithm consists of a self-organizing map (SOM) training stage and a clustering stage. It can distinguish the most meaningful features in the Concept Space without an evaluation function. We applied the algorithm to the Chinese Modern Long-distance Education Network and compared it with several popular clustering algorithms. The experimental results show that the average accuracy of WTCA is better than that of the other three algorithms.
Keywords :
Internet; data mining; pattern clustering; text analysis; Web text clustering algorithm; data mining; discovery feature subspace model; self-organizing map; Clustering algorithms; Computer science; Data mining; Educational institutions; Electronic mail; Forestry; Humans; Text mining; Web pages; World Wide Web; Clustering analysis; Richly structured datasets; SOM; Web text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control Conference, 2008. CCC 2008. 27th Chinese
Conference_Location :
Kunming
Print_ISBN :
978-7-900719-70-6
Electronic_ISBN :
978-7-900719-70-6
Type :
conf
DOI :
10.1109/CHICC.2008.4605816
Filename :
4605816