DocumentCode :
121225
Title :
Spam filtering techniques and MapReduce with SVM: A study
Author :
Kakade, Amol G. ; Kharat, Prashant K. ; Gupta, Amit Kumar ; Batra, Tushar
Author_Institution :
Dept. of Inf. Technol., Walchand Coll. of Eng., Sangli, India
fYear :
2014
fDate :
10-12 Feb. 2014
Firstpage :
59
Lastpage :
64
Abstract :
Spam is the most dangerous threat to email systems today. Spam is any unwanted and harmful mail, and separating it from legitimate mail is essential. This paper surveys spam filtering techniques, the problems of training a Support Vector Machine (SVM), and the need to introduce MapReduce (Hadoop) for SVM training. Techniques for separating spam are word based, content based, machine learning based, and hybrid. Machine learning techniques are the most popular because of their high accuracy and mathematical foundation. SVM is the most widely used machine learning technique in spam filtering because of its ability to handle data with a large number of attributes. The hurdles in training an SVM are the long training time and the inability to take a large dataset as input. Both problems can be addressed by implementing the training algorithm on the MapReduce (Hadoop) framework, which gives up to a 6 times speedup over the sequential algorithm.
Keywords :
data handling; support vector machines; unsolicited e-mail; MapReduce Hadoop; SVM training; content based; data handling; email systems; machine learning based; spam filtering techniques; spam separation; support vector machine training problems; word based; Artificial neural networks; Electronic mail; Filtering; Postal services; Servers; Support vector machines; Training; Content based; Machine learning based; MapReduce; Spam filtering techniques; Support Vector Machine (SVM); Word based;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Aided System Engineering (APCASE), 2014 Asia-Pacific Conference on
Conference_Location :
South Kuta
Print_ISBN :
978-1-4799-4570-2
Type :
conf
DOI :
10.1109/APCASE.2014.6924472
Filename :
6924472