Title :
Clustering Algorithm on Block Division of Documents
Author :
Liu, Gang ; Luo, Mingyue
Author_Institution :
Sch. of Electron. & Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
In the traditional K-means algorithm, the choice of the number of clusters and of the initial cluster centers strongly affects clustering quality. To reduce the dependence on the initial centers and to determine the category of new data quickly, an algorithm suited to text data is proposed. The algorithm takes document density as a parameter. Documents are first divided into blocks, and each block is then clustered separately. Experiments show that the algorithm not only yields higher clustering quality but also handles incrementally added data well.
Keywords :
document handling; pattern clustering; K-means algorithm; clustering algorithm; clustering quality; document block division; document density; Algorithm design and analysis; Clustering algorithms; Computational modeling; Fluctuations; Internet; Partitioning algorithms; Vocabulary
Conference_Title :
Wireless Communications Networking and Mobile Computing (WiCOM), 2010 6th International Conference on
Conference_Location :
Chengdu
Print_ISBN :
978-1-4244-3708-5
Electronic_ISBN :
978-1-4244-3709-2
DOI :
10.1109/WICOM.2010.5600166