Title :
Incremental Adaptive Spam Mail Filtering Using Naïve Bayesian Classification
Author :
Taninpong, Phimphaka ; Ngamsuriyaroj, Sudsanguan
Author_Institution :
Dept. of Comput. Sci., Mahidol Univ., Bangkok, Thailand
Abstract :
Most content-based spam filters are rule-based or trained off-line. Handling new spam tactics is difficult and prone to high misclassification rates. This paper proposes an incremental adaptive spam mail filter using Naïve Bayesian classification, which offers good performance, simplicity and adaptability. We model an incremental scheme that receives a stream of emails and applies a sliding window, training on only the last w emails before testing new incoming messages. Subsequently, the new features of tested messages are added to the existing features so that the model adapts to future incoming emails. The proposed model is tested on two corpora: Trec05p-1 and Trec06p. The parameters are the window size and the number of features, and the evaluation metrics are the processing time per message and the ham and spam misclassification rates. The experimental results show that the number of features has little impact, whereas the window size has significant effects on the misclassification rates and the processing time. In addition, the overall accuracy is even better than that obtained from batch off-line training, and the processing time is reduced significantly.
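The sliding-window idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `SlidingWindowNB`, the whitespace-token input, the Laplace smoothing, and the assumption that a true label becomes available after each message is tested are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, deque


class SlidingWindowNB:
    """Hypothetical multinomial naive Bayes trained on the last w messages."""

    def __init__(self, window_size):
        # A deque with maxlen drops the oldest message once the window is full.
        self.window = deque(maxlen=window_size)

    def _fit(self):
        # Rebuild the counts from the current window only.
        word_counts = {"ham": Counter(), "spam": Counter()}
        doc_counts = Counter()
        for tokens, label in self.window:
            word_counts[label].update(tokens)
            doc_counts[label] += 1
        vocab = set(word_counts["ham"]) | set(word_counts["spam"])
        return word_counts, doc_counts, vocab

    def predict(self, tokens):
        word_counts, doc_counts, vocab = self._fit()
        total_docs = sum(doc_counts.values())
        if total_docs == 0:
            return "ham"  # arbitrary default before any training data (assumption)
        scores = {}
        for label in ("ham", "spam"):
            # Laplace-smoothed log prior for the class.
            score = math.log((doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(word_counts[label].values())
            for t in tokens:
                if t in vocab:  # tokens unseen in the window are ignored
                    score += math.log(
                        (word_counts[label][t] + 1) / (total_words + len(vocab))
                    )
            scores[label] = score
        return max(scores, key=scores.get)

    def update(self, tokens, label):
        # Adding a tested message brings its new features into the model.
        self.window.append((tokens, label))


# Usage: test each incoming message, then add it to the window.
clf = SlidingWindowNB(window_size=4)
clf.update(["cheap", "pills"], "spam")
clf.update(["meeting", "agenda"], "ham")
prediction = clf.predict(["cheap", "pills", "now"])
```

Retraining from scratch on every prediction keeps the sketch short; an incremental variant would instead adjust the counts as messages enter and leave the window.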
Keywords :
Bayes methods; e-mail filters; information filtering; learning (artificial intelligence); pattern classification; unsolicited e-mail; electronic mail; incoming message testing; incremental adaptive spam mail filtering; incremental learning; misclassification rate; naive Bayesian classification; rule-based spam filter; sliding window; trained off-line; Adaptive filters; Bayesian methods; Computer science; Information filtering; Information filters; Postal services; Support vector machine classification; Support vector machines; Testing; Unsolicited electronic mail; adaptive; incremental learning; naïve Bayesian classification; spam mail filtering;
Conference_Title :
Software Engineering, Artificial Intelligences, Networking and Parallel/Distributed Computing, 2009. SNPD '09. 10th ACIS International Conference on
Conference_Location :
Daegu
Print_ISBN :
978-0-7695-3642-2
DOI :
10.1109/SNPD.2009.45