DocumentCode
2728745
Title
Text Feature Ranking Based on Rough-set Theory
Author
Tan, Songbo ; Wang, Yuefen ; Cheng, Xueqi
Author_Institution
Chinese Acad. of Sci., Beijing
fYear
2007
fDate
2-5 Nov. 2007
Firstpage
659
Lastpage
662
Abstract
With the aim of reducing dimensionality without sacrificing classification performance, the authors draw on attribute reduction based on the discernibility matrix in rough-set theory and propose two text feature selection algorithms, DB1 and DB2. The experimental results indicate that DB2 not only yields much higher accuracy than information gain when the number of features is smaller than 6000, but also requires much less CPU time than information gain.
Keywords
rough set theory; text analysis; attribute reduction; discernibility matrix; information gain; rough-set theory; text feature ranking; text feature selection algorithm; Classification algorithms; Computers; Feature extraction; Frequency; Geology; Iron; Performance gain; Symmetric matrices; Text categorization; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Web Intelligence, IEEE/WIC/ACM International Conference on
Conference_Location
Fremont, CA
Print_ISBN
978-0-7695-3026-0
Type
conf
DOI
10.1109/WI.2007.31
Filename
4427168