Title :
Multilingual e-mail classification using Bayesian filtering and language translation
Author :
Banday, M. Tariq ; Sheikh, Shafiya Afzal
Author_Institution :
Dept. of Electron., Univ. of Kashmir, Srinagar, India
Abstract :
E-mail spam is a continuously growing threat to users, E-mail Service Providers (ESPs), and Internet Service Providers (ISPs): it consumes mailbox space, bandwidth, and time by flooding systems with unwanted and unsolicited messages. It can also carry unsafe content such as virus programs, phishing frauds, and other malicious code that can be used to launch various types of attacks. Several techniques and tools, including anti-spam filters, are employed to filter out spam e-mails at servers and clients. This paper reviews the spam-filtering methods and techniques currently employed at major e-mail service providers and evaluates how well they filter non-English e-mail messages. It proposes a technique for building a translation module that augments current spam filters so that they can filter spam from non-English e-mail messages. The technique lets the spam filter train itself on a training data set in a chosen language and tune its parameters with every incoming message. An implementation of the technique as a translation module, together with experiments on a publicly available e-mail data corpus, validated the correctness and working of the proposed technique.
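The pipeline outlined in the abstract (translate a non-English message, then pass it to a Bayesian filter that keeps learning) can be illustrated with a minimal sketch. It is not the authors' implementation: the class `BayesianFilter`, the function `translate_to_english`, and the toy Spanish-to-English dictionary are hypothetical names introduced here, and a simple multinomial naive Bayes model with Laplace smoothing is assumed in place of whatever filter the paper actually used. A real system would call an online translation service where the dictionary lookup appears.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for an online translation service: a tiny
# word-for-word Spanish-to-English lookup, for illustration only.
TOY_DICTIONARY = {
    "gane": "win", "dinero": "money", "gratis": "free",
    "reunión": "meeting", "mañana": "tomorrow",
}


def translate_to_english(text):
    """Translate word by word; unknown words pass through unchanged."""
    words = re.findall(r"\w+", text.lower())
    return " ".join(TOY_DICTIONARY.get(word, word) for word in words)


class BayesianFilter:
    """Multinomial naive Bayes spam filter with a translation front end."""

    def __init__(self, translate=translate_to_english):
        self.translate = translate
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def _tokens(self, text):
        # Every message is translated before it is tokenized.
        return re.findall(r"\w+", self.translate(text).lower())

    def train(self, text, label):
        """Update the counts with one labelled message (incremental)."""
        self.word_counts[label].update(self._tokens(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Return the posterior probability that the message is spam."""
        tokens = self._tokens(text)
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Laplace-smoothed class prior.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(vocabulary) + 1
            for token in tokens:
                # Laplace-smoothed word likelihood.
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            log_scores[label] = score
        # Normalize in log space to avoid underflow on long messages.
        peak = max(log_scores.values())
        weights = {k: math.exp(v - peak) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


if __name__ == "__main__":
    spam_filter = BayesianFilter()
    spam_filter.train("Win free money now", "spam")
    spam_filter.train("Free money offer", "spam")
    spam_filter.train("Meeting tomorrow at noon", "ham")
    spam_filter.train("Project meeting notes", "ham")
    print(spam_filter.spam_probability("Gane dinero gratis"))
    print(spam_filter.spam_probability("Reunión mañana"))
```

Because `train` only updates counts, it can be called again for each incoming message once its label is known, which is one simple way to read the abstract's statement that the filter tunes its parameters with every message.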
Keywords :
Internet; computer crime; language translation; pattern classification; unsolicited e-mail; Bayesian filtering; ESP; ISP; Internet service providers; antispam filters; e-mail data corpus; e-mail service providers; e-mail spam; malicious code; multilingual e-mail classification; non-English language e-mail messages; phishing frauds; spam filters; unsolicited messages; unwanted messages; virus programs; Bayes methods; Filtering; Google; Postal services; Training; Unsolicited electronic mail; E-mail; HAM; Multilingual; Online Language Translation; SPAM
Conference_Title :
Contemporary Computing and Informatics (IC3I), 2014 International Conference on
Conference_Location :
Mysore
DOI :
10.1109/IC3I.2014.7019788