Title :
Online spam filtering using support vector machines
Author :
Amayri, Ola ; Bouguila, Nizar
Author_Institution :
Concordia Inst. for Inf. Syst. Eng., Concordia Univ., Montreal, QC, Canada
Abstract :
Most kernels used in SVMs are designed for continuous data and neglect the structure of text. In contrast to these classical kernels, we propose the use of various string kernels for spam filtering. Data preprocessing is also a vital part of text classification (TC), where the objective is to generate feature vectors usable by SVM kernels. We detail a feature mapping variant in TC that yields improved performance for the standard SVM in the filtering task. Furthermore, we propose an online active framework for spam filtering.
Keywords :
information filters; support vector machines; unsolicited e-mail; data preprocessing; feature mapping variant; feature vectors; online spam filtering; string kernels; support vector machines; text classification; Frequency; Information filtering; Information filters; Information systems; Kernel; Machine learning; Support vector machine classification; Support vector machines; Text categorization; Unsolicited electronic mail; Active; Feature Mapping; Online; Spam filtering; Support Vector Machines; Transductive Support Vector Machines;
Conference_Title :
Computers and Communications, 2009. ISCC 2009. IEEE Symposium on
Conference_Location :
Sousse
Print_ISBN :
978-1-4244-4672-8
Electronic_ISBN :
1530-1346
DOI :
10.1109/ISCC.2009.5202287