• DocumentCode
    1656714
  • Title
    Research on Text Feature Selection Algorithm Based on Information Gain and Feature Relation Tree
  • Author
    Hong Zhang ; Yong-gong Ren ; Xue Yang
  • Author_Institution
    Sch. of Comput. & Inf. Technol., Liaoning Normal Univ., Dalian, China
  • fYear
    2013
  • Firstpage
    446
  • Lastpage
    449
  • Abstract
    The classification performance of the conventional information gain (IG) algorithm can decline markedly when classes and features are unevenly distributed. To address this, an improved text feature selection method, UDsIG, is proposed. First, features are selected class by class to reduce the impact of an uneven class distribution on feature selection. Next, the distribution equilibrium of each feature is used to reduce the interference caused by unevenly distributed features. The class features are then processed with a feature relation tree model so that strongly correlated features are retained. Finally, an improved information gain formula based on weighed dispersion is used to obtain the optimal feature subset. Experimental results show that the proposed method achieves better classification performance.
  • Keywords
    feature selection; pattern classification; text analysis; IG algorithm; UDsIG; class maldistribution; classification performance; feature equilibrium; feature maldistribution; feature relation tree model; information gain; text feature selection algorithm; weighed dispersion; Classification algorithms; Computers; Correlation; Dispersion; Educational institutions; Feature extraction; Text categorization; feature relation tree
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Web Information System and Application Conference (WISA), 2013 10th
  • Conference_Location
    Yangzhou
  • Print_ISBN
    978-1-4799-3218-4
  • Type
    conf
  • DOI
    10.1109/WISA.2013.90
  • Filename
    6778681
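
The abstract takes the conventional information gain (IG) criterion as its baseline. As a point of reference, the following is a minimal sketch of that standard criterion only. It is not the UDsIG method, whose class-wise selection, feature relation tree, and weighed-dispersion formula are not specified in this record. The function names, the representation of documents as token sets, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels).values()
    return -sum((n / total) * math.log2(n / total) for n in counts)


def information_gain(docs, labels, term):
    """Standard IG of a term: H(C) minus H(C | term present or absent).

    docs   -- list of token sets, one per document (hypothetical layout)
    labels -- list of class labels, aligned with docs
    """
    n = len(labels)
    if n == 0:
        return 0.0
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    conditional = (len(with_term) / n) * entropy(with_term) \
        + (len(without_term) / n) * entropy(without_term)
    return entropy(labels) - conditional


# Toy corpus, invented purely for illustration.
docs = [
    {"goal", "match"},
    {"goal", "team"},
    {"vote", "party"},
    {"vote", "law"},
]
labels = ["sport", "sport", "politics", "politics"]

# Rank the vocabulary by IG, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary, key=lambda t: information_gain(docs, labels, t), reverse=True)
```

In this toy corpus, "goal" and "vote" each separate the two classes perfectly and therefore receive the maximum score, while terms that occur in a single document score lower. The abstract's point is that such a global ranking can degrade when classes or features are unevenly distributed.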