DocumentCode :
2850300
Title :
Text clustering based on the improved TFIDF by the iterative algorithm
Author :
Wang, Xingheng ; Cao, Jun ; Liu, Yao ; Gao, Shi ; Deng, Xue
Author_Institution :
Sch. of Inf. Sci. & Technol., East China Normal Univ., Shanghai, China
fYear :
2012
fDate :
24-27 June 2012
Firstpage :
140
Lastpage :
143
Abstract :
Text clustering, an important part of machine learning and pattern recognition, has extensive applications in natural language processing. This paper presents a method that addresses shortcomings of the classic TFIDF algorithm. The text is classified with a Naive Bayesian classifier, and an iterative algorithm optimizes the selection of feature words, which in turn repeatedly refines the classification. Experimental results show that the algorithm is efficient at feature selection and can increase classification accuracy.
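For orientation, the classic TFIDF weighting that the paper sets out to improve can be sketched as below. This is only the textbook baseline (term frequency times the log of inverse document frequency); the record does not describe the improved weighting or the iterative feature-selection step, so neither is reproduced here. The function name `tfidf` and the toy corpus are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Classic TF-IDF weights for a list of tokenized documents.

    Hypothetical helper for illustration; not the paper's improved variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # tf = relative frequency in this document, idf = log(N / df).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus (assumed for illustration only).
docs = [
    ["text", "clustering", "text"],
    ["text", "classification"],
    ["feature", "selection"],
]
weights = tfidf(docs)
```

A term that appears in every document gets weight zero under this scheme, which is one commonly cited limitation of the classic formula.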
Keywords :
iterative methods; learning (artificial intelligence); natural language processing; pattern clustering; text analysis; Naive Bayesian classifier; feature words; feature-selection; improved TFIDF algorithm; iterative algorithm; machine learning; pattern recognition; text clustering; Accuracy; Filtering; Text categorization; Naive Bayesian; TFIDF; VSM
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electrical & Electronics Engineering (EEESYM), 2012 IEEE Symposium on
Conference_Location :
Kuala Lumpur
Print_ISBN :
978-1-4673-2363-5
Type :
conf
DOI :
10.1109/EEESym.2012.6258608
Filename :
6258608