DocumentCode
3696263
Title
Research and Improvement of a Spam Filter Based on Naive Bayes
Author
Lin Li;Chi Li
Author_Institution
Dept. of Comput. Sci. &
Volume
2
fYear
2015
Firstpage
361
Lastpage
364
Abstract
A spam filter based on the Naive Bayes algorithm achieves good classification accuracy, but training on mail sample sets consumes substantial resources and reduces the overall efficiency of the system. In practical applications, features of the message text should therefore be selected to reduce the dimension of the feature vector space. TF-IDF is a simple and commonly used method for text feature selection. This paper improves the IDF weighting of TF-IDF feature selection by increasing the weight of words that occur with high frequency in their corresponding class, uses the improved TF-IDF algorithm to select features, and builds a Naive Bayes spam filter with improved TF-IDF feature weighting.
Keywords
"Postal services","Classification algorithms","Filtering","Filtering algorithms","Vocabulary","Electronic mail","Mathematical model"
Publisher
ieee
Conference_Titel
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2015 7th International Conference on
Print_ISBN
978-1-4799-8645-3
Type
conf
DOI
10.1109/IHMSC.2015.208
Filename
7334988