DocumentCode
1852206
Title
Training anti-spam models with smaller training set via SVM way
Author
Diao, LiLi ; Yang, Chengzhong
Author_Institution
Core-Technol., Trend Micro Inc., Nanjing, China
Volume
2
fYear
2010
fDate
1-3 Aug. 2010
Abstract
In the Internet era, email has become one of the most popular means of communication, but spam seriously bothers users. As a result, email filtering has become a hot research topic that has attracted much effort. Unfortunately, in real-world applications, the training email dataset is large-scale, unlike the datasets assumed in experiments, and this challenges both efficiency and effectiveness. A new method for filtering emails is therefore needed. In this paper, we propose an SVM-based machine learning method that compresses the training set with minimal information loss. The key step is to reduce the large-scale training email set according to the distribution of the support vectors produced by SVM training. The resulting compressed training set saves time while preserving precision when generating anti-spam models. Experiments show that the trained anti-spam classifier achieves better performance when our compression approach is applied.
Keywords
Internet; information filtering; support vector machines; unsolicited e-mail; Internet era; SVM based machine learning method; anti-spam models; email filtering; smaller training set; Electronic mail; Filtering; Learning systems; Machine learning; Redundancy; Support vector machines; Training; SVM (support vector machine); email filter; machine learning; training set shrink
fLanguage
English
Publisher
IEEE
Conference_Titel
Electronics and Information Engineering (ICEIE), 2010 International Conference On
Conference_Location
Kyoto
Print_ISBN
978-1-4244-7679-4
Electronic_ISBN
978-1-4244-7681-7
Type
conf
DOI
10.1109/ICEIE.2010.5559725
Filename
5559725