DocumentCode
3203265
Title
Text classification based on the TAN model
Author
Hong-Bo Shi; Zhi-Hai Wang; Hou-Kuan Huang; Li-Ping Jing
Author_Institution
Sch. of Comput. & Inf. Technol., Northern Jiaotong Univ., Beijing, China
Volume
1
fYear
2002
fDate
28-31 Oct. 2002
Firstpage
43
Abstract
This paper proposes a text classification method based on the TAN model. The Naive Bayesian classifier is one of the most effective and popular text classification methods, but its attribute independence assumption prevents it from expressing dependencies among text terms. TAN (Tree Augmented Naive Bayes) combines the simplicity of the Naive Bayesian classifier with the ability of a Bayesian network to express dependencies among attributes. This paper reviews some existing text classification methods, introduces the TAN model, and applies it to text classification. The Naive Bayesian and TAN classifiers are also compared experimentally. The results show that the TAN classifier performs better than the Naive Bayesian classifier.
Keywords
belief networks; data mining; feature extraction; learning (artificial intelligence); Bayesian network; Naive Bayesian classifier; TAN model; feature selection; machine; text classification; tree augmented Naive Bayes method; Bayesian methods; Distributed computing; Machine learning; NP-complete problem; Probability distribution; Supervised learning; Testing; Text categorization
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN
0-7803-7490-8
Type
conf
DOI
10.1109/TENCON.2002.1181210
Filename
1181210