• DocumentCode
    2415846
  • Title
    Feature selection for Spam and Phishing detection
  • Author
    Toolan, Fergus; Carthy, Joe
  • Author_Institution
    UCD Centre for Cybercrime Investig., Univ. Coll. Dublin, Dublin, Ireland
  • fYear
    2010
  • fDate
    18-20 Oct. 2010
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Unsolicited Bulk Email (UBE) has become a large problem in recent years. The number of mass mailers in existence is increasing dramatically. Automatically detecting UBE has become a vital area of current research. Many email clients (such as Outlook and Thunderbird) already have junk filters built in. Mass mailers are continually evolving and overcoming some of the junk filters. This means that the need for research in the area is ongoing. Many existing techniques seem to randomly choose the features that will be used for classification. This paper aims to address this issue by investigating the utility of over 40 features that have been used in recent literature. Information gain for these features is calculated over Ham, Spam and Phishing corpora.
  • Keywords
    computer crime; e-mail filters; unsolicited e-mail; Ham corpora; feature selection; junk filters; phishing detection; spam detection; unsolicited bulk email; Equations; Feature extraction; HTML; IP networks; Suspensions; Unsolicited electronic mail;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    eCrime Researchers Summit (eCrime), 2010
  • Conference_Location
    Dallas, TX
  • ISSN
    2159-1237
  • Print_ISBN
    978-1-4244-7760-9
  • Type
    conf
  • DOI
    10.1109/ecrime.2010.5706696
  • Filename
    5706696