• DocumentCode
    3234286
  • Title

    Identifying junk electronic mail in Microsoft Outlook with a support vector machine

  • Author

    Woitaszek, Matthew ; Shaaban, Muhammad ; Czernikowski, Roy

  • Author_Institution
    Colorado Univ., Boulder, CO, USA
  • fYear
    2003
  • fDate
    27-31 Jan. 2003
  • Firstpage
    166
  • Lastpage
    169
  • Abstract
    In this paper, we utilize a simple support vector machine to identify commercial electronic mail. The use of a personalized dictionary for model training provided a classification accuracy of 96.69%, while a much larger system dictionary achieved 95.26%. The classification system was subsequently implemented as an add-in for Microsoft Outlook XP, providing sorting and grouping capabilities using Outlook's interface to the typical desktop e-mail user.
  • Keywords
    document handling; electronic mail; Microsoft Outlook XP; classification accuracy; commercial electronic mail; desktop e-mail user; grouping capabilities; model training; personalized dictionary; simple support vector machine; sorting capabilities; spam; unsolicited commercial electronic mail; Dictionaries; Electronic mail; Filtering; Internet; Large-scale systems; Privacy; Sorting; Support vector machine classification; Support vector machines; Text categorization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applications and the Internet, 2003. Proceedings. 2003 Symposium on
  • Print_ISBN
    0-7695-1872-9
  • Type

    conf

  • DOI
    10.1109/SAINT.2003.1183045
  • Filename
    1183045