DocumentCode
2415846
Title
Feature selection for Spam and Phishing detection
Author
Toolan, Fergus ; Carthy, Joe
Author_Institution
UCD Centre for Cybercrime Investig., Univ. Coll. Dublin, Dublin, Ireland
fYear
2010
fDate
18-20 Oct. 2010
Firstpage
1
Lastpage
12
Abstract
Unsolicited Bulk Email (UBE) has become a large problem in recent years, and the number of mass mailers in existence is increasing dramatically. Automatically detecting UBE has become a vital area of current research. Many email clients (such as Outlook and Thunderbird) already have junk filters built in, but mass mailers are continually evolving and overcoming some of these filters, so the need for research in the area is ongoing. Many existing techniques seem to choose the features used for classification at random. This paper aims to address this issue by investigating the utility of over 40 features that have been used in recent literature. Information gain for these features is calculated over Ham, Spam and Phishing corpora.
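The information gain calculation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard definition, IG = H(class) - H(class | feature), applied to a discrete feature over three classes. The record does not give the paper's actual features, corpora, or implementation, so the feature names (`contains_html`, `even_position`) and the six-message toy data set below are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over v of P(feature = v) * H(labels | feature = v)."""
    total = len(labels)
    # Group the class labels by the feature value observed for each message.
    by_value = {}
    for value, label in zip(feature_values, labels):
        by_value.setdefault(value, []).append(label)
    conditional = sum(
        (len(subset) / total) * entropy(subset) for subset in by_value.values()
    )
    return entropy(labels) - conditional


# Hypothetical toy data: two messages per class.
labels = ["ham", "ham", "spam", "spam", "phishing", "phishing"]
# Hypothetical binary feature that separates ham from the other two classes.
contains_html = [0, 0, 1, 1, 1, 1]
# Hypothetical feature that is independent of the class.
even_position = [0, 1, 0, 1, 0, 1]

ig_html = information_gain(contains_html, labels)
ig_even = information_gain(even_position, labels)
print(f"contains_html: {ig_html:.3f} bits")
print(f"even_position: {ig_even:.3f} bits")
```

Ranking features by this score is one common way to decide which ones to keep: a feature that is independent of the class scores zero, while one that partly separates the classes scores higher.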
Keywords
computer crime; e-mail filters; unsolicited e-mail; Ham corpora; feature selection; junk filters; phishing detection; spam detection; unsolicited bulk email; Equations; Feature extraction; HTML; IP networks; Suspensions; Unsolicited electronic mail
fLanguage
English
Publisher
IEEE
Conference_Titel
eCrime Researchers Summit (eCrime), 2010
Conference_Location
Dallas, TX
ISSN
2159-1237
Print_ISBN
978-1-4244-7760-9
Type
conf
DOI
10.1109/ecrime.2010.5706696
Filename
5706696