Title :
WAF-based document clustering algorithm
Author :
Luo, Yang ; Chen, Guang ; Zhang, Yongtian
Author_Institution :
Sch. of Inf. & Commun., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
This paper proposes a novel document clustering algorithm based on Word Activation Forces (WAFs), a recently introduced type of statistic. A WAF matrix captures term occurrence and co-occurrence information in a document, reflecting underlying semantics that current document representations do not consider. The main idea is that the same word may form different relation networks in different documents, and these networks can be used to measure the similarity between documents. Experimental evaluations on the CLP2010 dataset show that the proposed method is efficient and accurate for document clustering.
Keywords :
pattern clustering; statistical analysis; text analysis; word processing; CLP2010; WAF matrix; WAF-based document clustering algorithm; document cooccurrence; document representations; word activation forces; Clustering algorithms; Document clustering; Document representation; WAF;
Conference_Title :
Computer Science and Network Technology (ICCSNT), 2011 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4577-1586-0
DOI :
10.1109/ICCSNT.2011.6181899