DocumentCode :
402859
Title :
TCBLSA: a new method of text clustering
Author :
Xu, Jian-Suo ; Wang, Zheng-Ou
Author_Institution :
Inst. of Syst. Eng., Tianjin Univ., China
Volume :
1
fYear :
2003
fDate :
2-5 Nov. 2003
Firstpage :
63
Abstract :
This paper presents TCBLSA, a new text clustering method based on the theory of latent semantic analysis (LSA). The term weights of the vector space model (VSM) are constructed using LSA together with the TF-IDF method. The method reduces the dimension of the document vectors and eliminates disadvantageous factors in the VSM. It also improves the speed and precision of text clustering. Analysis of experimental data demonstrates that the TCBLSA method is effective and feasible for text clustering.
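The abstract names the ingredients (TF-IDF term weights, LSA via singular value decomposition, a reduced vector dimension, then clustering) but not the details. The following is a minimal sketch of such a pipeline under stated assumptions, not the authors' implementation: whitespace tokenization, the weighting tf × log(N/df), the rank k, and the use of k-means with a farthest-point initialization are all illustrative choices that the abstract does not specify, and the function names are hypothetical.

```python
import math
from collections import Counter

import numpy as np


def tfidf_matrix(docs):
    """Return (terms, A), where A[i, j] is the TF-IDF weight of term i in document j.

    Assumption: whitespace tokenization and weight = tf * log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(terms)}
    n_docs = len(docs)
    df = Counter(t for toks in tokenized for t in set(toks))
    A = np.zeros((len(terms), n_docs))
    for j, toks in enumerate(tokenized):
        for t, tf in Counter(toks).items():
            A[index[t], j] = tf * math.log(n_docs / df[t])
    return terms, A


def lsa_document_vectors(A, k):
    """Map each document to a k-dimensional vector using a truncated SVD of A."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt[:k].T * s[:k]  # one row per document


def kmeans(X, k, n_iter=20):
    """Plain k-means with a deterministic farthest-point initialization.

    Assumption: k-means stands in for whatever clustering step the paper uses.
    """
    centers = [X[0]]
    for _ in range(1, k):
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(d, axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels


# Toy corpus (made up for illustration): two topics with disjoint vocabularies.
docs = [
    "cat dog pet",
    "dog cat animal",
    "stock market trade",
    "market stock price",
]
terms, A = tfidf_matrix(docs)
X = lsa_document_vectors(A, k=2)
print(kmeans(X, k=2))
```

On this toy corpus the 8-term TF-IDF vectors are reduced to 2 dimensions before clustering, and documents about the same topic end up with the same label even though each pair shares only some of its terms.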
Keywords :
singular value decomposition; text analysis; latent semantic analysis theory; term weight; text clustering; vector space model; Clustering methods; Data analysis; Frequency; Functional analysis; Machine learning; Matrix decomposition; Statistical analysis; Systems engineering and theory; Text mining
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Cybernetics, 2003 International Conference on
Print_ISBN :
0-7803-8131-9
Type :
conf
DOI :
10.1109/ICMLC.2003.1264443
Filename :
1264443