Regional Information Center for Science and Technology (RICEST) - A Safe Approach to Shrink Email Sample Set while Keeping Balance between Spam and Normal

DocumentCode :

2322756

Title :

A Safe Approach to Shrink Email Sample Set while Keeping Balance between Spam and Normal

Author :

Diao, LiLi ; Wang, Hao

Author_Institution :

Trend Micro Inc., Nanjing, China

fYear :

2009

fDate :

8-10 July 2009

Firstpage :

329

Lastpage :

334

Abstract :

To handle the full range of cases that arise when training anti-spam machine learning models, it is crucial to have a safe way to shrink the training sample set by reducing redundancy with minimal loss of information relevant to classification, while also balancing the distribution of samples. At present, no such solution exists. In this paper, we propose a safe approach that addresses these problems and improves the quality of the training email sample pool (set), yielding higher-quality machine learning models for an anti-spam engine with unbiased, high spam detection rates and low false positive rates.

Keywords :

e-mail filters; learning (artificial intelligence); pattern classification; support vector machines; unsolicited e-mail; anti-spam engine; classification; email training sample pool; low false positive rates; nonbiased high spam detection rates; trained machine learning models; Conferences; Engines; Industry applications; Machine learning; Software safety; Software testing; Software tools; Support vector machine classification; Support vector machines; Unsolicited electronic mail; SVM; anti-spam; machine learning;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Secure Software Integration and Reliability Improvement, 2009. SSIRI 2009. Third IEEE International Conference on

Conference_Location :

Shanghai

Print_ISBN :

978-0-7695-3758-0

Type :

conf

DOI :

10.1109/SSIRI.2009.66

Filename :

5325354

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2322756