Title :
The application of decision tree in Chinese email classification
Author :
Chen, Hao ; Zhan, Yan ; Li, Yan
Author_Institution :
Key Lab. of Machine Learning & Comput. Intell., Hebei Univ., Baoding, China
Abstract :
Email is a semi-structured document whose structure carries important attributes, and using spam-specific features in particular can improve email classification results. In this paper, we apply the decision tree data mining technique to uncover potential association rules among these email attributes, and then identify the category of an unknown email based on these rules. In experiments applying our email classifier to a large number of Chinese emails, the efficiency of our method is not lower than that of existing methods that check the whole text content of an email. Meanwhile, our method reduces the cost of computation and the consumption of system resources.
Keywords :
classification; data mining; decision trees; electronic mail; natural language processing; Chinese email classification; association rules; decision tree; semi-structured document; Classification algorithms; Classification tree analysis; Machine learning; Postal services; Association rule mining; Email classification; Spam-specific feature
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
DOI :
10.1109/ICMLC.2010.5581046