Title :
TCBLSA: a new method of text clustering
Author :
Xu, Jian-Suo ; Wang, Zheng-Ou
Author_Institution :
Inst. of Syst. Eng., Tianjin Univ., China
Abstract :
This paper presents TCBLSA, a new text clustering method based on the theory of latent semantic analysis (LSA). Term weights in the vector space model (VSM) are constructed using LSA together with the TF-IDF method. The method reduces the dimensionality of the document vectors and eliminates disadvantageous factors in the VSM, which improves both the speed and the precision of text clustering. Analysis of experimental data demonstrates that the TCBLSA method is effective and feasible for text clustering.
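The pipeline outlined in the abstract (TF-IDF term weights, LSA dimensionality reduction, then clustering) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact weighting formula, the number of latent dimensions, or the clustering algorithm. The smoothed IDF, the choice of k-means with a deterministic farthest-point initialization, the toy documents, and all function names below are assumptions made for illustration only.

```python
import numpy as np

def tfidf_matrix(docs):
    """Build a term-by-document TF-IDF matrix (assumed weighting, not the paper's)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    tf = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            tf[index[w], j] += 1.0
    df = np.count_nonzero(tf, axis=1)        # number of documents containing each term
    idf = np.log(len(docs) / df) + 1.0       # +1 keeps terms found in every document
    return tf * idf[:, None], vocab

def lsa_project(A, k):
    """Map each document to k latent dimensions using a truncated SVD of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    docs_k = (np.diag(s[:k]) @ Vt[:k, :]).T  # one row per document
    norms = np.linalg.norm(docs_k, axis=1, keepdims=True)
    return docs_k / np.where(norms == 0, 1.0, norms)

def kmeans(X, k, iters=50):
    """Plain k-means; the clustering step is an assumption, not taken from the paper."""
    centers = [X[0]]                         # deterministic farthest-point initialization
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        new_centers = np.array([
            X[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical toy corpus with two topics.
docs = [
    "matrix decomposition singular value",
    "singular value decomposition of a matrix",
    "text clustering groups similar documents",
    "clustering similar text documents",
]
A, vocab = tfidf_matrix(docs)    # terms x documents
X = lsa_project(A, k=2)          # documents x 2 latent dimensions
labels = kmeans(X, k=2)
print(labels)
```

Clustering in the reduced space means distances are computed over `k` dimensions instead of one dimension per vocabulary term, which is the kind of dimensionality reduction the abstract credits for the speed gain.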
Keywords :
singular value decomposition; text analysis; latent semantic analysis theory; term weight; text clustering; vector space model; Clustering methods; Data analysis; Frequency; Functional analysis; Machine learning; Matrix decomposition; Statistical analysis; Systems engineering and theory; Text mining
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
DOI :
10.1109/ICMLC.2003.1264443