DocumentCode :
2866920
Title :
An Improved χ2 (CHI) Statistics Method for Text Feature Selection
Author :
Yan, Tang ; Ting, Xiao
Author_Institution :
Coll. of Comput. & Inf. Sci., Southwest Univ., Chongqing, China
fYear :
2009
fDate :
11-13 Dec. 2009
Firstpage :
1
Lastpage :
4
Abstract :
Feature selection is an active topic in current research, especially in text categorization. To overcome the shortcomings of the traditional χ2 (CHI) approach, this paper proposes an improved χ2 (CHI) statistics method. It combines criteria from traditional statistical methods, such as Document Frequency and Class Accuracy, to improve the χ2 (CHI) statistic. Experimental results show that the proposed method is more effective than the traditional χ2 (CHI) method.
Keywords :
data mining; statistical analysis; χ2 (CHI) statistics method; class accuracy criterion; document frequency criterion; feature selection; text categorization; Data mining; Educational institutions; Entropy; Frequency; Information science; Mutual information; Statistical analysis; Statistics; Text categorization; Text mining
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Software Engineering, 2009. CiSE 2009. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-4507-3
Electronic_ISBN :
978-1-4244-4507-3
Type :
conf
DOI :
10.1109/CISE.2009.5366401
Filename :
5366401