DocumentCode :
2791390
Title :
Text clustering algorithm based on spectral graph seriation
Author :
Wensheng, Guo ; Guohe, Li
Author_Institution :
Dept. of Comput. Sci. & Technol., China Univ. of Pet.-Beijing, Changping, China
fYear :
2009
fDate :
17-19 June 2009
Firstpage :
4255
Lastpage :
4259
Abstract :
In the field of information processing, most existing text clustering algorithms are based on the vector space model (VSM). However, the VSM cannot effectively express the structure of a text, so it cannot fully express the text's semantic information. To improve the expression of semantic information, this paper presents a new text structure graph model. Using a weighted graph, the model represents the characteristic terms of a text and their associated location information. On the basis of spectral graph seriation, a spectral clustering algorithm is put forward. The algorithm replaces common-subgraph computation with matrix computation, thereby reducing the computational complexity of graph clustering. The paper also includes algorithm analysis and experiments. The results show that the text clustering algorithm based on spectral graph seriation is effective and feasible.
Keywords :
computational complexity; graph theory; matrix algebra; pattern clustering; text analysis; computational complexity; information processing; semantic information; spectral clustering algorithm; spectral graph seriation; text clustering algorithm; text structure graph model; vector space model; weighted graph; Clustering algorithms; Computational complexity; Computer science; Frequency; Geoscience; Inference algorithms; Information technology; Laboratories; Natural languages; Space technology; Graph Model; Spectral Graph Theory; Text Clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control and Decision Conference, 2009. CCDC '09. Chinese
Conference_Location :
Guilin
Print_ISBN :
978-1-4244-2722-2
Electronic_ISBN :
978-1-4244-2723-9
Type :
conf
DOI :
10.1109/CCDC.2009.5192371
Filename :
5192371