DocumentCode
3425723
Title
Dynamic classifier selection using clustering for spam detection
Author
Saeedian, Mehrnoush Famil ; Beigy, Hamid
Author_Institution
Comput. Eng. Dept., Sharif Univ. of Technol., Tehran
fYear
2009
fDate
March 30 2009-April 2 2009
Firstpage
84
Lastpage
88
Abstract
Most e-mail users have encountered spam, and spam detection has commonly been addressed as a text classification (categorization) problem. In this paper, we propose a novel spam detection method that uses an ensemble of classifiers based on clustering and selection techniques. E-mail content is diverse in genre, and the method finds the different topics in e-mails by clustering. It first computes disjoint clusters of e-mails and then trains a classifier on each cluster. When a new e-mail arrives, its cluster is identified, and the classifier of that cluster is selected to classify it. The method can extract many kinds of topics from e-mails. The evaluation shows that the algorithm outperforms majority voting.
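The cluster-then-select procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the base classifier, or the features, so everything concrete here is an assumption made for illustration: bag-of-words counts, cosine-similarity k-means with a deterministic initialization, and a multinomial Naive Bayes classifier per cluster. All names (`ClusterSelectionEnsemble`, `kmeans`, `NaiveBayes`) and the toy e-mails are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term counts for one e-mail (assumed feature choice)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iters=10):
    """Cosine k-means; the first k vectors seed the centers (assumes len(vectors) >= k)."""
    centers = [Counter(v) for v in vectors[:k]]
    assign = [0] * len(vectors)
    for _ in range(iters):
        assign = [max(range(k), key=lambda c: cosine(v, centers[c])) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:
                # Summing instead of averaging is fine: cosine ignores vector length.
                centers[c] = sum(members, Counter())
    return centers, assign

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (assumed base classifier)."""
    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for v, y in zip(vectors, labels):
            self.counts[y].update(v)
        self.vocab = set().union(*(self.counts[c] for c in self.classes))
        return self

    def predict(self, v):
        def score(c):
            # The extra +1 reserves one vocabulary slot for unseen words.
            total = sum(self.counts[c].values()) + len(self.vocab) + 1
            s = math.log(self.prior[c] / sum(self.prior.values()))
            for t, n in v.items():
                s += n * math.log((self.counts[c][t] + 1) / total)
            return s
        return max(self.classes, key=score)

class ClusterSelectionEnsemble:
    """Train one classifier per cluster; route each new e-mail to its cluster's classifier."""
    def fit(self, texts, labels, k=2):
        vectors = [vectorize(t) for t in texts]
        self.centers, assign = kmeans(vectors, k)
        self.fallback = NaiveBayes().fit(vectors, labels)  # used if a cluster ended up empty
        self.models = {}
        for c in range(k):
            idx = [i for i, a in enumerate(assign) if a == c]
            if idx:
                self.models[c] = NaiveBayes().fit([vectors[i] for i in idx],
                                                  [labels[i] for i in idx])
        return self

    def predict(self, text):
        v = vectorize(text)
        c = max(range(len(self.centers)), key=lambda j: cosine(v, self.centers[j]))
        return self.models.get(c, self.fallback).predict(v)

# Toy data, invented for illustration: two topics, each containing spam and ham.
texts = [
    "bank account statement attached",       # ham
    "doctor appointment reminder tomorrow",  # ham
    "bank account password verify urgent",   # spam
    "doctor pills cheap discount",           # spam
    "bank statement monthly report",         # ham
    "pills cheap doctor free",               # spam
    "doctor appointment confirmed",          # ham
    "bank urgent verify account",            # spam
]
labels = ["ham", "ham", "spam", "spam", "ham", "spam", "ham", "spam"]

model = ClusterSelectionEnsemble().fit(texts, labels, k=2)
print(model.predict("bank account verify urgent"))   # spam
print(model.predict("doctor appointment tomorrow"))  # ham
```

In this sketch exactly one classifier decides each e-mail, which is what distinguishes dynamic selection from the majority-voting baseline mentioned in the abstract, where every ensemble member votes on every e-mail.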
Keywords
pattern clustering; security of data; unsolicited e-mail; dynamic classifier selection; e-mail; spam detection clustering; text classification; Bagging; Buffer storage; Computers; Content addressable storage; Decision making; Decision trees; History; Internet; Intrusion detection; Training data; classification; classifier selection; clustering; ensemble; spam
fLanguage
English
Publisher
IEEE
Conference_Titel
Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
Conference_Location
Nashville, TN
Print_ISBN
978-1-4244-2765-9
Type
conf
DOI
10.1109/CIDM.2009.4938633
Filename
4938633