DocumentCode :
3362701
Title :
An Improved Ambiguity Measure Feature Selection for Text Categorization
Author :
Liu, Zhiying ; Yang, Jieming
Author_Institution :
Coll. of Inf. Eng., Northeast Dianli Univ., Jilin, China
Volume :
1
fYear :
2012
fDate :
26-27 Aug. 2012
Firstpage :
220
Lastpage :
223
Abstract :
The high dimensionality of text categorization poses major obstacles to applying many sophisticated learning algorithms. Feature selection, which reduces the number of features that represent documents, is therefore essential in text categorization. In this paper, we propose a feature selection method that improves the performance of the Ambiguity Measure (AM) feature selection. We compare the proposed method with four feature selection methods, namely Information Gain (IG), Ambiguity Measure (AM), Odds Ratio (OR) and Mutual Information (MI), using two classification algorithms (Naïve Bayes and Support Vector Machines) on three datasets (20-Newsgroups, Reuters-21578 and WebKB). The experiments show that the proposed method performs significantly better than AM and MI, and achieves performance comparable to IG and OR.
Keywords :
Bayes methods; learning (artificial intelligence); pattern classification; support vector machines; text analysis; AM; IG; MI; OR; ambiguity measure feature selection; classification algorithms; feature selection method; naïve Bayes; sophisticated learning algorithms; text categorization; Accuracy; Algorithm design and analysis; Mutual information; Training; dimensionality reduction; feature selection
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2012 4th International Conference on
Conference_Location :
Nanchang, Jiangxi
Print_ISBN :
978-1-4673-1902-7
Type :
conf
DOI :
10.1109/IHMSC.2012.62
Filename :
6305666