DocumentCode
1656714
Title
Research on Text Feature Selection Algorithm Based on Information Gain and Feature Relation Tree
Author
Hong Zhang ; Yong-gong Ren ; Xue Yang
Author_Institution
Sch. of Comput. & Inf. Technol., Liaoning Normal Univ., Dalian, China
fYear
2013
Firstpage
446
Lastpage
449
Abstract
The classification performance of the conventional information gain (IG) algorithm can decline markedly when classes and features are unevenly distributed. To address this, an improved text feature selection method, UDsIG, is proposed. First, features are selected class by class to reduce the effect of uneven class distribution on feature selection. Next, the equilibrium of each feature's distribution is used to reduce the interference caused by unevenly distributed features. The per-class features are then processed with a feature relation tree model so that strongly correlated features are retained. Finally, an improved information gain formula based on weighed dispersion is used to obtain the optimal feature subset. Experimental results show that the proposed method achieves better classification performance.
Keywords
feature selection; pattern classification; text analysis; IG algorithm; UDsIG; class maldistribution; classification performance; feature equilibrium; feature maldistribution; feature relation tree model; information gain; text feature selection algorithm; weighed dispersion; Classification algorithms; Computers; Correlation; Dispersion; Educational institutions; Feature extraction; Text categorization; feature relation tree
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Information System and Application Conference (WISA), 2013 10th
Conference_Location
Yangzhou
Print_ISBN
978-1-4799-3218-4
Type
conf
DOI
10.1109/WISA.2013.90
Filename
6778681