DocumentCode :
3722611
Title :
Improved Expected Cross Entropy Method for Text Feature Selection
Author :
Guohua Wu;Liuyang Wang;Nailiang Zhao;Hairong Lin
Author_Institution :
Sch. of Comput. Sci. &
fYear :
2015
Firstpage :
49
Lastpage :
54
Abstract :
Feature selection plays an important role in text categorization and directly affects categorization accuracy. Because the traditional expected cross entropy algorithm does not take document frequency into account, we first improve the traditional expected cross entropy formula and then propose an improved text feature selection method based on word frequency information. The method modifies the expected cross entropy algorithm in three respects: the frequency of features within a category, the frequency distribution within a category, and the frequency distribution among different categories. Text categorization results show that the improved expected cross entropy feature selection approach yields better categorization performance.
Keywords :
"Entropy","Text categorization","Classification algorithms","Algorithm design and analysis","Computer science","Training","Computers"
Publisher :
IEEE
Conference_Titel :
Computer Science and Mechanical Automation (CSMA), 2015 International Conference on
Type :
conf
DOI :
10.1109/CSMA.2015.17
Filename :
7371621