DocumentCode :
2752139
Title :
Comparison of a SOM based sequence analysis system and naive Bayesian classifier for spam filtering
Author :
Luo, Xiao ; Zincir-Heywood, Nur
Author_Institution :
Fac. of Comput. Sci., Dalhousie Univ., Halifax, NS, Canada
Volume :
4
fYear :
2005
fDate :
July 31 2005-Aug. 4 2005
Firstpage :
2571
Abstract :
Unsolicited bulk email, also known as "spam", generates a need for reliable anti-spam filters. In this paper, we design a new SOM based sequence analysis (SBSA) system for the spam filtering task and evaluate its performance. The system combines a SOM based sequential data representation with a kNN classifier designed to make use of word sequence information. We compare this system with the traditional baseline method, the naive Bayesian filter. Three different cost scenarios and suitable cost-sensitive measures are employed. The results show that the SBSA system is superior to the naive Bayesian filter, particularly when the misclassification cost for non-spam messages is high.
Keywords :
belief networks; data structures; filtering theory; self-organising feature maps; unsolicited e-mail; SOM based sequence analysis system; k-nearest neighbor; naive Bayesian classifier; self-organizing featured maps; spam filtering; Bayesian methods; Classification algorithms; Costs; Electronic mail; Filtering; Filters; Machine learning; Performance analysis; Postal services; Self organizing feature maps
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
0-7803-9048-2
Type :
conf
DOI :
10.1109/IJCNN.2005.1556308
Filename :
1556308