Title :
An Improved Semantic Smoothing Model for Model-Based Document Clustering
Author :
Cai, Jiarong ; Liu, Yubao ; Yin, Jian
Author_Institution :
Sun Yat-Sen Univ., Guangzhou
fDate :
July 30, 2007 - Aug. 1, 2007
Abstract :
Semantic smoothing has recently been proposed as an efficient way to improve document clustering quality. However, the existing semantic smoothing model does not effectively improve clustering quality for partitional clustering. In this paper, inspired by the TF*IDF scheme and the background elimination strategy, we first introduce an improved semantic smoothing model that is suitable for both agglomerative and partitional clustering. Based on this improved model, we then present two model-based document clustering algorithms, one partitional and one agglomerative. Experimental results show that our algorithms improve cluster quality more effectively than previous methods.
Keywords :
pattern clustering; text analysis; agglomerative-partitional clustering; model-based text document clustering; semantic smoothing model; Artificial intelligence; Clustering algorithms; Computer science; Data mining; Distributed computing; Information retrieval; Partitioning algorithms; Planets; Smoothing methods; Software engineering
Conference_Titel :
Software Engineering, Artificial Intelligence, Networking, and Parallel/Distributed Computing, 2007. SNPD 2007. Eighth ACIS International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-0-7695-2909-7
DOI :
10.1109/SNPD.2007.155