Title :
Chinese Spam Filter Based on Relaxed Online Support Vector Machine
Author :
Han, Yong ; He, Xiaoning ; Yang, Muyun ; Qi, Haoliang ; Song, Chao
Author_Institution :
Sch. of Comput. Sci. & Technol., Harbin Inst. of Technol., Harbin, China
Abstract :
Spam filtering is a classic online learning problem. As the training set grows, Online SVM becomes progressively slower. We therefore relax the constraints of Online SVM to obtain the Relaxed Online SVM (ROSVM) model, which improves speed while preserving filtering performance. In this paper, we apply this model to Chinese spam filtering. Our model outperforms the best system in the TREC 2006 Chinese spam filter track. Our filter also participated in the SEWM 2010 spam filter track, where it achieved the best 1-ROCA% score on both the delayed feedback task and the active learning task.
Keywords :
information filtering; support vector machines; unsolicited e-mail; SEWM 2010 spam filter track; TREC 2006 Chinese spam filter track; online learning problem; relaxed online support vector machine; spam filtering; Feature extraction; Filtering; Machine learning algorithms; Support vector machines; Training; Unsolicited electronic mail; Chinese spam filtering; Relaxed online SVM; online learning
Conference_Title :
Asian Language Processing (IALP), 2010 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4244-9063-9
DOI :
10.1109/IALP.2010.90