• DocumentCode
    3425723
  • Title

    Dynamic classifier selection using clustering for spam detection

  • Author

    Saeedian, Mehrnoush Famil ; Beigy, Hamid

  • Author_Institution
    Comput. Eng. Dept., Sharif Univ. of Technol., Tehran
  • fYear
    2009
  • fDate
    March 30 2009-April 2 2009
  • Firstpage
    84
  • Lastpage
    88
  • Abstract
    Most e-mail users have encountered spam, and spam detection has been addressed as a text classification (categorization) problem. In this paper, we propose a novel spam detection method that uses an ensemble of classifiers based on clustering and selection techniques. E-mail content varies in genre, and the method finds the different topics in e-mails by clustering. It first computes disjoint clusters of e-mails and then trains a classifier on each cluster. When a new e-mail arrives, its cluster is identified, and the classifier of that cluster is selected to classify it. The method can extract many kinds of topics in e-mails. The evaluation shows that the algorithm outperforms majority voting.
  • Keywords
    pattern clustering; security of data; unsolicited e-mail; dynamic classifier selection; e-mail; spam detection clustering; text classification; Bagging; Buffer storage; Computers; Content addressable storage; Decision making; Decision trees; History; Internet; Intrusion detection; Training data; classification; classifier selection; clustering; ensemble; spam;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Data Mining, 2009. CIDM '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2765-9
  • Type

    conf

  • DOI
    10.1109/CIDM.2009.4938633
  • Filename
    4938633
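
The cluster-then-select procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm, the base classifier, or the feature representation, so everything concrete here is an assumption made for illustration: a small k-means with deterministic initialization, a nearest-class-centroid classifier per cluster, two-dimensional numeric feature vectors, and the hypothetical names `kmeans`, `class_centroids`, and `ClusterSelectEnsemble`. It is not the authors' implementation.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def nearest(p, centroids):
    """Index of the centroid closest to p."""
    return min(range(len(centroids)), key=lambda i: dist2(p, centroids[i]))


def kmeans(points, k, iters=20):
    """Tiny k-means (assumed clustering step); the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster received no points.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def class_centroids(pairs):
    """Assumed base classifier: one mean vector per label."""
    by_label = {}
    for x, label in pairs:
        by_label.setdefault(label, []).append(x)
    return {label: mean(xs) for label, xs in by_label.items()}


class ClusterSelectEnsemble:
    """Train one classifier per cluster; classify with the matching cluster's model."""

    def __init__(self, k):
        self.k = k

    def fit(self, X, y):
        self.centroids = kmeans(X, self.k)
        members = [[] for _ in range(self.k)]
        for x, label in zip(X, y):
            members[nearest(x, self.centroids)].append((x, label))
        # Fallback model for clusters that ended up with no training e-mails.
        fallback = class_centroids(list(zip(X, y)))
        self.models = [class_centroids(m) if m else fallback for m in members]
        return self

    def predict(self, x):
        # Selection step: identify the cluster, then use only its classifier.
        model = self.models[nearest(x, self.centroids)]
        return min(model, key=lambda label: dist2(x, model[label]))


# Toy data (made up): feature 0 separates two topics, feature 1 marks spam
# in the first topic but ham in the second.
X = [[0.0, 0.0], [10.0, 5.0], [0.0, 5.0], [10.0, 0.0],
     [1.0, 0.5], [9.0, 4.5], [1.0, 4.5], [9.0, 0.5]]
y = ["ham", "ham", "spam", "spam", "ham", "ham", "spam", "spam"]

model = ClusterSelectEnsemble(k=2).fit(X, y)
print(model.predict([0.5, 4.0]))  # spam
print(model.predict([9.5, 4.0]))  # ham
```

In the toy data the same value of the second feature indicates spam in one topic and ham in the other, which is the situation where selecting a per-cluster classifier can help over a single global one.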