DocumentCode :
3256257
Title :
Using DBSCAN clustering algorithm in spam identifying
Author :
Ying, Wu ; Kai, Yang ; Jianzhong, Zhang
Author_Institution :
Dept. of Comput. Sci., Nankai Univ., Tianjin, China
Volume :
1
fYear :
2010
fDate :
22-24 June 2010
Abstract :
Anti-spam mechanisms are currently a focus of Internet research, and spam identification plays an important role in them. Identifying spam efficiently usually requires the ability to identify similar emails, i.e. spam clustering. With existing clustering methods, many similar emails are split across several groups. To improve the accuracy of spam identification, we present a new clustering method based on the DBSCAN clustering algorithm and the nilsimsa digest algorithm. With this method, all emails manually identified as similar are clustered together. The simulation results show that the clustering method based on DBSCAN and nilsimsa achieves higher clustering accuracy than the other clustering methods. From the simulation results, we can also conclude that the shape of the spam digest subspace is irregular.
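Illustrative sketch (not taken from the paper): the abstract combines a digest per email with density-based clustering, and the Python below shows one way that pairing could look. It is a minimal sketch under stated assumptions: digests are taken to be precomputed 64-character hex strings (256 bits), distance is taken to be the number of differing bits, and the names hamming_bits and dbscan, the toy digests, and the eps and min_pts values are all hypothetical. The paper's actual distance measure and parameters are not given in this record.

```python
from collections import deque


def hamming_bits(a: str, b: str) -> int:
    """Count differing bits between two equal-length hex digests (assumed metric)."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def dbscan(digests, eps, min_pts):
    """Textbook DBSCAN over digests; returns one label per digest, -1 for noise."""
    n = len(digests)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # Brute-force region query; includes the point itself (distance 0).
        return [j for j in range(n)
                if hamming_bits(digests[i], digests[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:  # core point: keep growing the cluster
                queue.extend(reachable)
    return labels


# Hypothetical toy digests: two tight groups and one outlier.
toy = [
    "0" * 64,
    "0" * 63 + "1",
    "0" * 63 + "3",
    "f" * 64,
    "f" * 63 + "e",
    "0" * 32 + "f" * 32,
]
labels = dbscan(toy, eps=2, min_pts=2)
print(labels)
```

Because DBSCAN grows clusters through chains of nearby points rather than around a centroid, it can follow irregularly shaped groups, which fits the abstract's remark about the shape of the spam digest subspace.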
Keywords :
Internet; information analysis; unsolicited e-mail; DBSCAN clustering; Internet research; spam identification; Clustering algorithms; Clustering methods; Computer science; Computer science education; Data mining; Educational technology; IP networks; Shape; Unsolicited electronic mail; DBSCAN; cluster; nilsimsa; spam
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Education Technology and Computer (ICETC), 2010 2nd International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-6367-1
Type :
conf
DOI :
10.1109/ICETC.2010.5529221
Filename :
5529221