• DocumentCode
    2957897
  • Title

    Effective spam classification based on meta-heuristics

  • Author

    Yeh, Chi-yuan ; Wu, Chih-Hung ; Doong, Shing-Hwang

  • Author_Institution
    Dept. of Inf. Manage., Shu-Te Univ., Kaohsiung, Taiwan
  • Volume
    4
  • fYear
    2005
  • fDate
    10-12 Oct. 2005
  • Firstpage
    3872
  • Abstract
    Using machine learning techniques such as naive Bayes, decision trees and support vector machines to automatically filter out spam e-mails has drawn many researchers' attention. Previous methods use keywords contained in e-mails to extract binary features from the corpus. However, since the keywords of e-mails change from time to time, the performance of keyword-based solutions is not stable. In this study, we use behaviors of spammers as the features for classifying e-mails. Such behaviors are first described by meta-heuristics and then used as features of e-mails for classification. A total of 113 new features are extracted from the given meta-heuristics. Using existing machine learning techniques, the filtering performance is much better than that of keyword-based filtering. In addition, the training time is substantially reduced because of the low-dimensional feature space and sparse feature vectors.
  • Keywords
    belief networks; decision trees; learning (artificial intelligence); support vector machines; unsolicited e-mail; binary feature extraction; keyword-based filtering; low dimensional feature space; machine learning; meta-heuristics; naive Bayes; spam classification; spam e-mail; sparse feature vector; support vector machine; Bayesian methods; Electronic mail; Feature extraction; Genetic programming; Information filtering; Information filters; Unsolicited electronic mail; Naïve Bayesian; classification; spam
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2005 IEEE International Conference on
  • Print_ISBN
    0-7803-9298-1
  • Type

    conf

  • DOI
    10.1109/ICSMC.2005.1571750
  • Filename
    1571750