Title :
Chinese text mining based on subspace clustering
Author :
Zhang, Yan ; Jiang, Mingyan
Author_Institution :
Sch. of Inf. Sci. & Eng., Shandong Univ., Jinan, China
Abstract :
Several features of Chinese texts create a technical bottleneck in Chinese text mining, and the results that traditional methods currently obtain for Chinese text clustering are not very satisfactory. In this paper, we apply an English text clustering method, Text Clustering via Particle Swarm Optimizer (TCPSO), to the Chinese text clustering problem. We first preprocess the Chinese texts and then apply TCPSO to cluster them. Simulation results on a text dataset selected from Chinese Natural Language Processing (CNLP) show that this approach effectively improves clustering quality and achieves better results than the k-means algorithm.
Keywords :
data mining; natural language processing; particle swarm optimisation; pattern clustering; text analysis; Chinese natural language processing; Chinese text clustering; Chinese text mining; English texts clustering; particle swarm optimizer; Clustering algorithms; Clustering methods; Computers; Fuzzy systems; Particle swarm optimization; Text mining; Transforms; TCPSO; subspace clustering
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
DOI :
10.1109/FSKD.2010.5569363