• DocumentCode
    2844264
  • Title

    An efficient feature selection using multi-criteria in text categorization

  • Author

    Doan, Son ; Horiguchi, Susumu

  • Author_Institution
    Graduate Sch. of Inf. Sci., Japan Adv. Inst. of Sci. & Technol., Ishikawa, Japan
  • fYear
    2004
  • fDate
    5-8 Dec. 2004
  • Firstpage
    86
  • Lastpage
    91
  • Abstract
    Text categorization is the problem of assigning a document to one or more predefined classes. One of the most interesting issues in text categorization is feature selection. This paper proposes a novel approach to feature selection based on multicriteria ranking of features. Based on a threshold value for each criterion, a new procedure for feature selection is proposed and applied to text categorization. Experiments with the Reuters-21578 benchmark data and the naive Bayes algorithm show that the proposed approach outperforms conventional feature selection methods.
  • Keywords
    Bayes methods; feature extraction; learning (artificial intelligence); pattern classification; text analysis; Reuters-21578 benchmark data; criterion threshold value; feature selection methods; multicriteria feature ranking; naive Bayes algorithm; text categorization; Data mining; Electronic mail; Feature extraction; Filtering; Filters; Information science; NP-hard problem; Natural languages; Text categorization; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems, 2004. HIS '04. Fourth International Conference on
  • Print_ISBN
    0-7695-2291-2
  • Type

    conf

  • DOI
    10.1109/ICHIS.2004.20
  • Filename
    1409986