Title :
Spam Detection Using Dynamic Weighted Voting Based on Clustering
Author :
Saeedian, Mehrnoush Famil ; Beigy, Hamid
Author_Institution :
Dept. of Comput. Eng., Sharif Univ. of Technol., Tehran
Abstract :
Over the last decade, spam detection has been treated as a text classification (categorization) problem. In this paper we propose a new dynamic weighted voting method that combines clustering with weighted voting, and apply it to spam filtering. To classify a new sample, the method first compares it with all cluster centroids to determine its similarity to each cluster; classifiers in the vicinity of the input sample then receive greater weight in the ensemble's final decision. The evaluation shows that the algorithm outperforms a pure SVM.
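The classification step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity measure, the weight normalization, or how classifiers are tied to clusters, so cosine similarity, sum-to-one weights, one classifier per cluster, and the names `cosine` and `dynamic_weighted_vote` are all hypothetical choices.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


def dynamic_weighted_vote(x, centroids, classifiers):
    """Classify x as spam (1) or ham (0) by similarity-weighted voting.

    Assumption: classifiers[i] is a callable associated with cluster i
    that returns 1 for spam and 0 for ham.
    """
    # Similarity of the sample to every cluster centroid, clipped at zero.
    sims = [max(cosine(x, c), 0.0) for c in centroids]
    total = sum(sims)
    n = len(classifiers)
    # Closer clusters get larger weights; fall back to uniform weights
    # when the sample is similar to no cluster at all.
    weights = [s / total for s in sims] if total > 0 else [1.0 / n] * n
    # Weighted sum of the individual votes, thresholded at one half.
    spam_score = sum(w * clf(x) for w, clf in zip(weights, classifiers))
    return (1 if spam_score >= 0.5 else 0), weights


# Toy usage with two clusters and two stand-in classifiers.
centroids = [[1.0, 0.0], [0.0, 1.0]]
classifiers = [lambda x: 1, lambda x: 0]
label, weights = dynamic_weighted_vote([0.9, 0.1], centroids, classifiers)
```

In this toy setup the sample lies close to the first centroid, so the first classifier dominates the vote. In practice the stand-in lambdas would be replaced by trained models such as SVMs.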
Keywords :
e-mail filters; pattern classification; pattern clustering; security of data; unsolicited e-mail; clustering; dynamic weighted voting method; spam detection; spam filtering; text categorization problem; text classification problem; Filtering; Filters; Machine learning algorithms; Nearest neighbor searches; Niobium; Support vector machine classification; Support vector machines; Training data; Unsolicited electronic mail; Voting; classification; classifier fusion; ensemble; spam
Conference_Titel :
Intelligent Information Technology Application, 2008. IITA '08. Second International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3497-8
DOI :
10.1109/IITA.2008.140