DocumentCode
3089488
Title
Chinese Text Classification Based on Extended Naïve Bayes Model with Weighed Positive Features
Author
Qiu, Yaying ; Yang, Guangming ; Tan, Zhenhua
Author_Institution
Software Coll., Northeastern Univ., Shenyang, China
fYear
2010
fDate
17-19 Sept. 2010
Firstpage
243
Lastpage
246
Abstract
As a simple but efficient classification method, the Naïve Bayes algorithm has shown desirable characteristics in many fields. However, its performance still needs to be improved for practical use. In this paper, we construct an extended model that assigns weights to some important features. A method called CF is used to measure the relevance between a feature and a category, compensating for the deficiency of the CHI-Square statistic method. We select the best features using a newly proposed method called CHCFW, which reinforces the distribution of key features in a document and removes disturbing features. Experimental results show that, compared with the original Naïve Bayes model and other feature-weighting algorithms, the CHCFW method performs better and is more appropriate for larger amounts of training documents.
Keywords
Bayes methods; natural language processing; pattern classification; text analysis; CHI-Square statistic method; Chinese text classification; extended Naïve Bayes model; weighed positive features; Classification algorithms; Computational modeling; Machine learning; Signal processing algorithms; Text categorization; Training; Vocabulary; CHCFW method; Extended Bayes Model; assign weight; dependency type; feature selection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Pervasive Computing Signal Processing and Applications (PCSPA), 2010 First International Conference on
Conference_Location
Harbin
Print_ISBN
978-1-4244-8043-2
Electronic_ISBN
978-0-7695-4180-8
Type
conf
DOI
10.1109/PCSPA.2010.66
Filename
5635946