• DocumentCode
    3089488
  • Title
    Chinese Text Classification Based on Extended Naïve Bayes Model with Weighed Positive Features
  • Author
    Qiu, Yaying ; Yang, Guangming ; Tan, Zhenhua
  • Author_Institution
    Software Coll., Northeastern Univ., Shenyang, China
  • fYear
    2010
  • fDate
    17-19 Sept. 2010
  • Firstpage
    243
  • Lastpage
    246
  • Abstract
    Naïve Bayes is a simple but efficient classification method that has shown desirable characteristics in many fields. However, its performance still needs to be improved for practical applications. In this paper, we construct an extended model that assigns weights to some important features. A method called CF is used to measure the relevance between a feature and a category, compensating for the deficiency of the CHI-Square statistic method. We select the best features using a newly proposed method called CHCFW, which reinforces the distribution of key features in a document and removes disturbing features. Compared with the original Naïve Bayes model and other algorithms for assigning weights to features, the experimental results show that the CHCFW method performs better and is more appropriate for larger numbers of training documents.
  • Keywords
    Bayes methods; natural language processing; pattern classification; text analysis; CHI-Square statistic method; Chinese text classification; extended Naïve Bayes model; weighed positive features; Classification algorithms; Computational modeling; Machine learning; Signal processing algorithms; Text categorization; Training; Vocabulary; CHCFW method; Extended Bayes Model; assign weight; dependency type; feature selection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pervasive Computing Signal Processing and Applications (PCSPA), 2010 First International Conference on
  • Conference_Location
    Harbin
  • Print_ISBN
    978-1-4244-8043-2
  • Electronic_ISBN
    978-0-7695-4180-8
  • Type
    conf
  • DOI
    10.1109/PCSPA.2010.66
  • Filename
    5635946