• DocumentCode
    2844155
  • Title

    An empirical performance comparison of machine learning methods for spam e-mail categorization

  • Author

    Lai, Chih-Chin ; Tsai, Ming-Chi

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Univ. of Tainan, Taiwan
  • fYear
    2004
  • fDate
    5-8 Dec. 2004
  • Firstpage
    44
  • Lastpage
    48
  • Abstract
    The increasing volume of unsolicited bulk e-mail (also known as spam) has generated a need for reliable antispam filters. Using a classifier based on machine learning techniques to automatically filter out spam e-mail has drawn many researchers' attention. In this paper, we review some relevant ideas and conduct a set of systematic experiments on e-mail categorization with four machine learning algorithms applied to different parts of an e-mail. Experimental results reveal that the e-mail header provides very useful information for all the machine learning algorithms considered to detect spam e-mail.
  • Keywords
    information filters; learning (artificial intelligence); unsolicited e-mail; antispam filters; e-mail categorization; machine learning; spam; unsolicited bulk e-mail; Electronic mail; Filtering; Filters; Learning systems; Machine learning; Machine learning algorithms; Niobium; Support vector machine classification; Support vector machines; Unsolicited electronic mail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Hybrid Intelligent Systems, 2004. HIS '04. Fourth International Conference on
  • Print_ISBN
    0-7695-2291-2
  • Type

    conf

  • DOI
    10.1109/ICHIS.2004.21
  • Filename
    1409979