• DocumentCode
    1566561
  • Title

    Bayesian Additive Regression Trees-Based Spam Detection for Enhanced Email Privacy

  • Author

    Abu-Nimeh, Saeed ; Nappa, Dario ; Wang, Xinlei ; Nair, Suku

  • Author_Institution
    SMU HACNet Lab., Southern Methodist Univ., Dallas, TX
  • fYear
    2008
  • Firstpage
    1044
  • Lastpage
    1051
  • Abstract
    Spam is considered an invasion of privacy. Its changeable structures and variability raise the need for new spam classification techniques. The present study proposes using Bayesian additive regression trees (BART) for spam classification and evaluates its performance against other classification methods, including logistic regression, support vector machines, classification and regression trees, neural networks, random forests, and naive Bayes. BART in its original form is not designed for such problems; hence, we modify BART to make it applicable to classification problems. We evaluate the classifiers using three spam datasets (Ling-Spam, PU1, and Spambase) to determine their predictive accuracy and false positive rate.
  • Keywords
    Bayes methods; data privacy; pattern classification; regression analysis; security of data; trees (mathematics); unsolicited e-mail; Bayesian additive regression tree; email privacy; spam classification technique; spam detection; Accuracy; Bayesian methods; Classification tree analysis; Logistics; Neural networks; Privacy; Regression tree analysis; Support vector machine classification; Support vector machines; Unsolicited electronic mail; BART; CART; NNet; SVM; classification; logistic regression; random forests; spam
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Availability, Reliability and Security, 2008. ARES 08. Third International Conference on
  • Conference_Location
    Barcelona
  • Print_ISBN
    978-0-7695-3102-1
  • Type

    conf

  • DOI
    10.1109/ARES.2008.136
  • Filename
    4529459