DocumentCode
2801371
Title
Bayesian Chinese Spam Filter Based on Crossed N-gram
Author
Dong, Jianshe ; Cao, Haixia ; Liu, Peng ; Ren, Li
Author_Institution
Lanzhou University of Technology, China
Volume
3
fYear
2006
fDate
Oct. 2006
Firstpage
103
Lastpage
108
Abstract
Naive Bayesian spam email filters are a well-known and powerful type of filter that can easily be induced from a dataset of sample cases. However, the problem of word segmentation for Chinese email restricts their performance. In this paper, we present a Bayesian Chinese spam filter based on crossed N-grams. This method does not require word segmentation of Chinese emails, so it is not limited by inaccurate segmentation. It also does not need a word-segmentation dictionary, and because its space and time efficiency are improved, it is easy to install on the user terminal to build an individualized spam filter. The independence assumption of the naive Bayes method is relaxed to some degree. Experimental results show that the proposed method achieves high accuracy at low cost.
Keywords
Bayesian methods; Computer networks; Data engineering; Dictionaries; Educational technology; Grid computing; Information filtering; Information filters; Military computing; Unsolicited electronic mail;
fLanguage
English
Publisher
IEEE
Conference_Titel
Intelligent Systems Design and Applications, 2006. ISDA '06. Sixth International Conference on
Conference_Location
Jinan, China
Print_ISBN
0-7695-2528-8
Type
conf
DOI
10.1109/ISDA.2006.17
Filename
4021867