• DocumentCode
    2728745
  • Title
    Text Feature Ranking Based on Rough-set Theory
  • Author
    Tan, Songbo ; Wang, Yuefen ; Cheng, Xueqi
  • Author_Institution
    Chinese Acad. of Sci., Beijing
  • fYear
    2007
  • fDate
    2-5 Nov. 2007
  • Firstpage
    659
  • Lastpage
    662
  • Abstract
    Aiming to reduce dimensionality without sacrificing classification performance, the authors draw on attribute reduction based on the discernibility matrix in rough-set theory and propose two text feature selection algorithms, DB1 and DB2. The experimental results indicate that DB2 not only yields much higher accuracy than information gain when the number of features is smaller than 6000, but also requires much less CPU time than information gain.
  • Keywords
    rough set theory; text analysis; attribute reduction; discernibility matrix; information gain; text feature ranking; text feature selection algorithm; Classification algorithms; Computers; Feature extraction; Frequency; Geology; Iron; Performance gain; Symmetric matrices; Text categorization; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Web Intelligence, IEEE/WIC/ACM International Conference on
  • Conference_Location
    Fremont, CA
  • Print_ISBN
    978-0-7695-3026-0
  • Type
    conf
  • DOI
    10.1109/WI.2007.31
  • Filename
    4427168