DocumentCode :
3089488
Title :
Chinese Text Classification Based on Extended Naïve Bayes Model with Weighed Positive Features
Author :
Qiu, Yaying ; Yang, Guangming ; Tan, Zhenhua
Author_Institution :
Software Coll., Northeastern Univ., Shenyang, China
fYear :
2010
fDate :
17-19 Sept. 2010
Firstpage :
243
Lastpage :
246
Abstract :
As a simple but efficient classification method, the Naïve Bayes algorithm has shown desirable characteristics in many fields. However, its effectiveness still needs to be improved for practical use. In this paper, we construct an extended model that assigns weights to some important features. A method called CF is used to measure the relevance between a feature and a category, making up for the deficiency of the CHI-Square statistic method. We select the best features using a newly proposed method called CHCFW, which reinforces the distribution of key features in a document and removes disturbing features. Compared with the original Naïve Bayes model and other algorithms for assigning weights to features, the experimental results show that the CHCFW method performs better and is more appropriate for larger amounts of training documents.
Keywords :
Bayes methods; natural language processing; pattern classification; text analysis; CHI-Square statistic method; Chinese text classification; extended Naïve Bayes model; weighed positive features; Classification algorithms; Computational modeling; Machine learning; Signal processing algorithms; Text categorization; Training; Vocabulary; CHCFW method; Extended Bayes Model; assign weight; dependency type; feature selection;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pervasive Computing Signal Processing and Applications (PCSPA), 2010 First International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4244-8043-2
Electronic_ISBN :
978-0-7695-4180-8
Type :
conf
DOI :
10.1109/PCSPA.2010.66
Filename :
5635946