DocumentCode :
1962851
Title :
A New Chinese Text Feature Selection Method in Centroid-Based Classifier
Author :
Gu, Yijun ; Wang, Rong ; Wang, Jianhua ; Yu, Jiangde
Author_Institution :
Coll. of Inf. Security & Eng., Chinese People's Public Security Univ., Beijing
fYear :
2008
fDate :
23-25 May 2008
Firstpage :
88
Lastpage :
92
Abstract :
Feature selection based on text study is currently a mainstream method. Its key research problem is finding a suitable feature assessment method that reduces the number of words to be processed as far as possible without decreasing classification precision, thereby improving the speed and efficiency of classification. Building on a study of the classical feature assessment methods in the existing literature, this paper proposes a new feature assessment method, the entropy ratio. The method considers not only a feature's classification ability but also its generalization ability. Applying it to the centroid-based classifier is a new and better choice for improving classification performance. Experimental results show that features selected with this method give clearly better results than those selected with other methods, especially when few features are selected.
Keywords :
classification; feature extraction; natural languages; text analysis; Chinese text feature selection method; centroid-based classifier; entropy ratio; Bayesian methods; Classification tree analysis; Educational institutions; Frequency; Information processing; Information security; Standardization; Support vector machine classification; Support vector machines; Text categorization; Automatic text classification; Centroid-Based Classifier; Text Feature Selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Processing (ISIP), 2008 International Symposiums on
Conference_Location :
Moscow
Print_ISBN :
978-0-7695-3151-9
Type :
conf
DOI :
10.1109/ISIP.2008.108
Filename :
4554063