Title :
Text classification based on the TAN model
Author :
Hong-Bo Shi; Zhi-Hai Wang; Hou-Kuan Huang; Li-Ping Jing
Author_Institution :
Sch. of Comput. & Inf. Technol., Northern Jiaotong Univ., Beijing, China
Abstract :
This paper proposes a text classification method based on the TAN model. The Naive Bayesian classifier is the most effective and popular text classification method, but its attribute independence assumption makes it unable to express the dependence among text terms. TAN (Tree Augmented Naive Bayes) combines the simplicity of the Naive Bayesian classifier with the ability of a Bayesian network to express dependence among attributes. This paper reviews some existing text classification methods, introduces the TAN model, and applies it to text classification. The Naive Bayesian and TAN classifiers are also compared experimentally. The experimental results show that the TAN classifier performs better than the Naive Bayesian classifier.
Keywords :
belief networks; data mining; feature extraction; learning (artificial intelligence); Bayesian network; Naive Bayesian classifier; TAN model; data mining; feature selection; machine; text classification; tree augmented Naive Bayes method; Bayesian methods; Data mining; Distributed computing; Machine learning; NP-complete problem; Probability distribution; Supervised learning; Testing; Text categorization;
Conference_Title :
TENCON '02. Proceedings. 2002 IEEE Region 10 Conference on Computers, Communications, Control and Power Engineering
Print_ISBN :
0-7803-7490-8
DOI :
10.1109/TENCON.2002.1181210