DocumentCode :
263311
Title :
Automated generation of ham rules for Vietnamese spam filtering
Author :
Quan Dang Dinh ; Quang Anh Tran ; Jiang, Frank
Author_Institution :
Fac. of Inf. Technol., Hanoi Univ., Hanoi, Vietnam
fYear :
2014
fDate :
14-17 Dec. 2014
Firstpage :
1
Lastpage :
5
Abstract :
The topic of spam filtering has been thoroughly studied by researchers in the past few decades. There have been successful works with high spam detection rates, yet no paper has described a method that can effectively detect spam and, at the same time, measure the importance of ham emails. In this paper, we propose a method of generating SpamAssassin rules that can indicate the degree of importance of an email message. Specifically, we added a proportion of negatively weighted ham rules and adapted HPSOWM, an efficient evolutionary algorithm, to optimize SpamAssassin rule scores. As a result, using our new rule set, SpamAssassin is able to give indicative scores for both spam and ham. These scores can be utilized by email clients to categorize incoming messages based on their importance to the user. Various experiments were conducted to evaluate our method. In addition, a conclusion was drawn about the best ratio of spam rules to ham rules.
Keywords :
e-mail filters; evolutionary computation; information filtering; natural language processing; SpamAssassin rule; Vietnamese spam filtering; adapted HPSOWM; automated generation; email client; email message; evolutionary algorithm; ham email; negatively weighted ham rules; spam detection rate; Accuracy; Educational institutions; Error analysis; Training; Unsolicited electronic mail; HPSOWM; SpamAssassin; automated anti-spam rules; ham rules; spam filtering;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computational Intelligence for Security and Defense Applications (CISDA), 2014 Seventh IEEE Symposium on
Conference_Location :
Hanoi
Type :
conf
DOI :
10.1109/CISDA.2014.7035628
Filename :
7035628