DocumentCode :
238859
Title :
Multilingual e-mail classification using Bayesian filtering and language translation
Author :
Banday, M. Tariq ; Sheikh, Shafiya Afzal
Author_Institution :
Dept. of Electron., Univ. of Kashmir, Srinagar, India
fYear :
2014
fDate :
27-29 Nov. 2014
Firstpage :
696
Lastpage :
701
Abstract :
E-mail spam is a continuously growing threat to e-mail users, E-mail Service Providers (ESPs), and Internet Service Providers (ISPs), as it consumes users' mailbox space, bandwidth, and time by flooding the system with unwanted and unsolicited messages. It can contain unsafe content such as virus programs, phishing frauds, and other malicious code that can be used to launch various types of attacks. Several techniques and tools, including anti-spam filters, are employed to filter out spam e-mails at servers and clients. This paper reviews the spam-filtering methods and techniques currently employed at major e-mail service providers and evaluates their performance in filtering non-English e-mail messages. It proposes a technique for building a translation module that can augment current spam filters, enabling them to filter spam from non-English e-mail messages. The technique permits the spam filter to train itself on a training data set in a chosen language and to tune its parameters with every incoming message. An implementation of the technique through a translation module, together with experiments using a publicly available e-mail data corpus, validated the correctness and working of the proposed technique.
Keywords :
Internet; computer crime; language translation; pattern classification; unsolicited e-mail; Bayesian filtering; ESP; ISP; Internet service providers; antispam filters; e-mail data corpus; e-mail service providers; e-mail spam; language translation; malicious code; multilingual e-mail classification; nonEnglish language e-mail messages; phishing frauds; spam filters; unsolicited messages; unwanted messages; virus programs; Bayes methods; Filtering; Google; Postal services; Training; Unsolicited electronic mail; E-mail; Filtering; HAM; Multilingual; Online Language Translation; SPAM;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Contemporary Computing and Informatics (IC3I), 2014 International Conference on
Conference_Location :
Mysore
Type :
conf
DOI :
10.1109/IC3I.2014.7019788
Filename :
7019788