• DocumentCode
    2185455
  • Title

    Integrating compound terms in Bayesian text classification

  • Author

    Bai, Jing ; Nie, Jian-Yun ; Cao, Guihong

  • Author_Institution
    Département d'Informatique et de Recherche Opérationnelle, Univ. de Montréal, Que., Canada
  • fYear
    2005
  • fDate
    19-22 Sept. 2005
  • Firstpage
    598
  • Lastpage
    601
  • Abstract
    Text classification usually assumes a word-based document representation. In this paper, we propose a new approach that integrates compound terms into Bayesian text classification. Compound terms are used as features complementary to single words. A key problem is accounting for their dependence on their component words. We propose using smoothing techniques to combine the compound-term and word representations. Experiments were conducted on two corpora. Our results show that this approach improves classification performance slightly but consistently on both test corpora.
  • Keywords
    Bayes methods; classification; text analysis; Bayesian text classification; compound term; smoothing technique; word-based document representation; Bayesian methods; Information retrieval; Niobium; Smoothing methods; Support vector machine classification; Support vector machines; Testing; Text categorization; Text recognition; Tree data structures;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2005. Proceedings. The 2005 IEEE/WIC/ACM International Conference on
  • Print_ISBN
    0-7695-2415-X
  • Type

    conf

  • DOI
    10.1109/WI.2005.79
  • Filename
    1517916