DocumentCode :
3314445
Title :
A MapReduce based parallel SVM for large scale spam filtering
Author :
Caruana, G. ; Maozhen Li ; Man Qi
Author_Institution :
Sch. of Eng. & Design, Brunel Univ., Uxbridge, UK
Volume :
4
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
2659
Lastpage :
2662
Abstract :
Spam continues to inflict increasing damage. Various approaches, including Support Vector Machine (SVM) based techniques, have been proposed for spam classification. However, SVM training is a computationally intensive process. This paper presents a parallel SVM algorithm for scalable spam filtering. By distributing, processing and optimizing subsets of the training data across multiple participating nodes, the distributed SVM reduces the training time significantly. Ontology based concepts are also employed to minimize accuracy degradation when the training data is distributed amongst the SVM classifiers.
Keywords :
ontologies (artificial intelligence); parallel algorithms; pattern classification; support vector machines; unsolicited e-mail; MapReduce; SVM classifiers; large scale spam filtering; ontology; parallel SVM algorithm; spam classification; Accuracy; Filtering; Machine learning; Ontologies; Support vector machines; Training; Training data; Classification; Machine Learning; Ontology Semantics; Parallel Computing; Support Vector Machine;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6020074
Filename :
6020074