DocumentCode
3008784
Title
Classification of email using BeaKS: Behavior and keyword stemming
Author
Bhat, Veena H. ; Malkani, Vandana R. ; Shenoy, P. Deepa ; Venugopal, K.R. ; Patnaik, L.M.
Author_Institution
Dept. of CSE, Univ. Visvesvaraya, Bangalore, India
fYear
2011
fDate
21-24 Nov. 2011
Firstpage
1139
Lastpage
1143
Abstract
Spam mails are one of the greatest challenges faced by internet service providers, organizations and internet users alike. Spam may be targeted, sent with malicious intent, or sent simply as commercial marketing; on the whole it is unwanted by everyone except the sender. Spam filters continuously evolve as spammers become more technically savvy and creative. Machine learning algorithms have been widely used to classify mails as spam or ham (the good emails). This work presents a spam filter, BeaKS, with a focused preprocessing phase that combines the content of the email with two behavioral characteristics extracted from the email to predict the category a mail belongs to: spam or ham. The accuracy of the proposed prediction model, using Random Forests as the classifier, is shown to be superior to that of other recent techniques. The approach is simple, easy to implement and reliable.
Keywords
learning (artificial intelligence); pattern classification; security of data; unsolicited e-mail; BeaKS; behavioral characteristics; classifier; email classification; ham mails; keyword stemming; machine learning algorithms; random forests; spam filters; spam mails; Accuracy; Artificial neural networks; Feature extraction; Niobium; Postal services; Unsolicited electronic mail; Email classification; email content; machine learning; random forests; spammer behaviour;
fLanguage
English
Publisher
IEEE
Conference_Titel
TENCON 2011 - 2011 IEEE Region 10 Conference
Conference_Location
Bali
ISSN
2159-3442
Print_ISBN
978-1-4577-0256-3
Type
conf
DOI
10.1109/TENCON.2011.6129290
Filename
6129290