Title :
The application of LSI truncation in spam filtering dimension reduction
Author_Institution :
Sch. of Inf. Eng., East China JiaoTong Univ., Nanchang, China
Abstract :
This paper studies dimension reduction as a means of reducing the computational effort of spam filtering, which is particularly important for time-sensitive online spam filtering. To this end, the effects of truncating the singular value decomposition in latent semantic indexing (LSI) are investigated and compared. It is shown that, in the context of spam filtering, a surprisingly large reduction in problem dimension is often possible with the proposed method without heavy loss in filter performance.
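As an illustration of the kind of truncation the abstract refers to, the following is a minimal sketch of rank-k truncated SVD applied to a term-document matrix, assuming NumPy. It is not the authors' implementation: the function name `lsi_truncate`, the toy matrix `A`, and the choice k = 2 are hypothetical and chosen only for the example.

```python
import numpy as np

def lsi_truncate(term_doc, k):
    """Project documents into a k-dimensional latent space via truncated SVD."""
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    # Keep only the k leading left singular vectors and singular values.
    U_k, s_k = U[:, :k], s[:k]
    # Each document (a column of term_doc) is mapped to U_k^T d.
    docs_k = U_k.T @ term_doc
    return U_k, s_k, docs_k

# Toy term-document matrix (rows: terms, columns: messages); values are made up.
A = np.array([
    [2, 0, 1, 0],
    [1, 0, 2, 0],
    [0, 3, 0, 1],
    [0, 1, 0, 2],
    [1, 1, 1, 1],
], dtype=float)

U_k, s_k, docs_k = lsi_truncate(A, k=2)
print(docs_k.shape)  # (2, 4): four messages, each reduced from 5 to 2 features

# A new message is folded into the same reduced space with the same projection.
new_msg = np.array([1, 2, 0, 0, 1], dtype=float)
new_msg_k = U_k.T @ new_msg
```

A downstream classifier would then be trained on the columns of `docs_k` rather than on the full term vectors; how small k can be made before filter performance degrades is the question the paper investigates.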
Keywords :
indexing; information filtering; singular value decomposition; unsolicited e-mail; LSI truncation; latent semantic index; spam filtering dimension reduction; time-sensitive online spam problem; Electronic mail; Feature extraction; Filtering; Large scale integration; Matrix decomposition; Semantics; Support vector machine classification; Dimension reduction; Spam filtering
Conference_Title :
Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
Conference_Location :
Qingdao
Print_ISBN :
978-1-4244-6526-2
DOI :
10.1109/ICMLC.2010.5580684