DocumentCode
3234286
Title
Identifying junk electronic mail in Microsoft Outlook with a support vector machine
Author
Woitaszek, Matthew ; Shaaban, Muhammad ; Czernikowski, Roy
Author_Institution
Colorado Univ., Boulder, CO, USA
fYear
2003
fDate
27-31 Jan. 2003
Firstpage
166
Lastpage
169
Abstract
In this paper, we utilize a simple support vector machine to identify commercial electronic mail. The use of a personalized dictionary for model training provided a classification accuracy of 96.69%, while a much larger system dictionary achieved 95.26%. The classification system was subsequently implemented as an add-in for Microsoft Outlook XP, providing sorting and grouping capabilities using Outlook's interface to the typical desktop e-mail user.
Keywords
document handling; electronic mail; Microsoft Outlook XP; classification accuracy; commercial electronic mail; desktop e-mail user; grouping capabilities; model training; personalized dictionary; simple support vector machine; sorting capabilities; spam; unsolicited commercial electronic mail; Dictionaries; Electronic mail; Filtering; Internet; Large-scale systems; Privacy; Sorting; Support vector machine classification; Support vector machines; Text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Applications and the Internet, 2003. Proceedings. 2003 Symposium on
Print_ISBN
0-7695-1872-9
Type
conf
DOI
10.1109/SAINT.2003.1183045
Filename
1183045