DocumentCode :
3696263
Title :
Research and Improvement of a Spam Filter Based on Naive Bayes
Author :
Lin Li;Chi Li
Author_Institution :
Dept. of Comput. Sci. &
Volume :
2
fYear :
2015
Firstpage :
361
Lastpage :
364
Abstract :
A spam filter based on the Naive Bayes algorithm achieves good classification accuracy, but training on mail sample sets consumes substantial resources and lowers the overall efficiency of the system. In practical applications, features of the message text should therefore be selected to reduce the dimension of the feature vector space. TF-IDF is a simple and commonly used method for text feature selection. This paper improves the IDF weighting of TF-IDF feature selection by increasing the weight of words that occur with high frequency in their corresponding class, uses the improved TF-IDF algorithm to select features, and builds a naive Bayesian spam filter with improved TF-IDF feature weighting.
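The pipeline named in the abstract (TF-IDF feature selection followed by a naive Bayes classifier) can be illustrated with a minimal sketch. The abstract does not give the formula for the improved IDF weighting, so the sketch assumes the standard TF-IDF score and a multinomial naive Bayes model with Laplace smoothing; it is not the authors' method. All function names, the toy messages, and the choice of k are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization.
    return text.lower().split()


def select_features(docs, k):
    """Keep the k terms with the highest summed standard TF-IDF score."""
    n = len(docs)
    tokenized = [tokenize(d) for d in docs]
    # Document frequency: number of messages containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    scores = Counter()
    for toks in tokenized:
        for term, count in Counter(toks).items():
            scores[term] += (count / len(toks)) * math.log(n / df[term])
    return {term for term, _ in scores.most_common(k)}


def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes restricted to the selected vocabulary."""
    class_docs = Counter(labels)
    term_counts = {c: Counter() for c in class_docs}
    for doc, label in zip(docs, labels):
        term_counts[label].update(t for t in tokenize(doc) if t in vocab)
    model = {}
    for c in class_docs:
        total = sum(term_counts[c].values())
        log_prior = math.log(class_docs[c] / len(docs))
        # Laplace (add-one) smoothing over the reduced vocabulary.
        log_like = {
            t: math.log((term_counts[c][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[c] = (log_prior, log_like)
    return model


def classify(model, text):
    """Return the class with the highest log posterior; unknown terms are ignored."""
    toks = tokenize(text)

    def score(c):
        log_prior, log_like = model[c]
        return log_prior + sum(log_like[t] for t in toks if t in log_like)

    return max(model, key=score)


# Hypothetical toy training set, for illustration only.
docs = [
    "win free money now",
    "free prize claim now",
    "meeting schedule for monday",
    "project report attached for review",
]
labels = ["spam", "spam", "ham", "ham"]

vocab = select_features(docs, 10)   # reduce the feature space to 10 terms
model = train_nb(docs, labels, vocab)
print(classify(model, "claim free money"))
print(classify(model, "meeting on monday"))
```

Restricting the likelihood table to the selected vocabulary is what reduces the dimension of the feature space: terms outside the vocabulary are neither counted during training nor scored during classification.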
Keywords :
"Postal services","Classification algorithms","Filtering","Filtering algorithms","Vocabulary","Electronic mail","Mathematical model"
Publisher :
IEEE
Conference_Titel :
Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2015 7th International Conference on
Print_ISBN :
978-1-4799-8645-3
Type :
conf
DOI :
10.1109/IHMSC.2015.208
Filename :
7334988