DocumentCode
1566561
Title
Bayesian Additive Regression Trees-Based Spam Detection for Enhanced Email Privacy
Author
Abu-Nimeh, Saeed ; Nappa, Dario ; Wang, Xinlei ; Nair, Suku
Author_Institution
SMU HACNet Lab., Southern Methodist Univ., Dallas, TX
fYear
2008
Firstpage
1044
Lastpage
1051
Abstract
Spam is considered an invasion of privacy. Its changeable structures and variability raise the need for new spam classification techniques. The present study proposes using Bayesian additive regression trees (BART) for spam classification and evaluates its performance against other classification methods, including logistic regression, support vector machines, classification and regression trees, neural networks, random forests, and naive Bayes. BART in its original form is not designed for such problems; hence, we modify BART to make it applicable to classification problems. We evaluate the classifiers using three spam datasets (Ling-Spam, PU1, and Spambase) to determine their predictive accuracy and false positive rate.
Keywords
Bayes methods; data privacy; pattern classification; regression analysis; security of data; trees (mathematics); unsolicited e-mail; Bayesian additive regression tree; email privacy; spam classification technique; spam detection; Accuracy; Bayesian methods; Classification tree analysis; Logistics; Neural networks; Privacy; Regression tree analysis; Support vector machine classification; Support vector machines; Unsolicited electronic mail; BART; CART; NNet; SVM; classification; logistic regression; random forests; spam;
fLanguage
English
Publisher
IEEE
Conference_Titel
Availability, Reliability and Security, 2008. ARES 08. Third International Conference on
Conference_Location
Barcelona
Print_ISBN
978-0-7695-3102-1
Type
conf
DOI
10.1109/ARES.2008.136
Filename
4529459