• DocumentCode
    3696263
  • Title

    Research and Improvement of a Spam Filter Based on Naive Bayes

  • Author

    Lin Li;Chi Li

  • Author_Institution
    Dept. of Comput. Sci. &
  • Volume
    2
  • fYear
    2015
  • Firstpage
    361
  • Lastpage
    364
  • Abstract
    A spam filter based on the Naive Bayes algorithm achieves good classification accuracy, but training on mail sample sets consumes considerable resources and reduces the overall efficiency of the system. In practical applications, the features of the message text should therefore be selected so as to reduce the dimension of the feature vector space. TF-IDF is a simple and commonly used method for text feature selection. This paper improves the IDF weighting algorithm of TF-IDF feature selection to increase the weight of words that occur with high frequency in their corresponding class, uses the improved TF-IDF algorithm to select features, and builds a Naive Bayes spam filter with improved TF-IDF feature weighting.
  • Keywords
    "Postal services","Classification algorithms","Filtering","Filtering algorithms","Vocabulary","Electronic mail","Mathematical model"
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Human-Machine Systems and Cybernetics (IHMSC), 2015 7th International Conference on
  • Print_ISBN
    978-1-4799-8645-3
  • Type

    conf

  • DOI
    10.1109/IHMSC.2015.208
  • Filename
    7334988
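
The pipeline outlined in the abstract (select features by TF-IDF, then train a Naive Bayes filter on the reduced vocabulary) can be sketched roughly as follows. This is a minimal illustration only: it uses the standard, unmodified TF-IDF score, because the record does not give the paper's improved IDF formula. The function names, the toy messages, and the cutoff `k` are all assumptions made for the example.

```python
import math
from collections import Counter


def tfidf_select(docs, k):
    """Return the k terms with the highest summed TF-IDF score.

    Uses plain IDF = log(N / df); this is NOT the paper's improved weighting.
    """
    n = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # document frequency: count each term once per doc
    scores = Counter()
    for tokens in docs:
        tf = Counter(tokens)
        for term, count in tf.items():
            scores[term] += (count / len(tokens)) * math.log(n / df[term])
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Multinomial Naive Bayes with Laplace smoothing, restricted to vocab."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for tokens, c in zip(docs, labels):
        counts[c].update(t for t in tokens if t in vocab)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(vocab)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in vocab}
    return prior, loglik


def classify(tokens, model):
    """Return the class with the highest log-posterior; unselected terms are ignored."""
    prior, loglik = model

    def score(c):
        return prior[c] + sum(loglik[c][t] for t in tokens if t in loglik[c])

    return max(prior, key=score)


# Hypothetical toy corpus, invented for illustration only.
docs = [
    "win cash prize now".split(),
    "win cash offer now".split(),
    "cheap cash prize win".split(),
    "meeting agenda for monday".split(),
    "project meeting notes attached".split(),
    "monday meeting project agenda".split(),
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

vocab = tfidf_select(docs, k=8)         # reduce 13 distinct terms to 8 features
model = train_nb(docs, labels, vocab)
print(classify("win a cash prize".split(), model))                # spam
print(classify("agenda for the project meeting".split(), model))  # ham
```

Restricting the classifier to the selected vocabulary is what shrinks the feature vector space; a class-aware IDF such as the one the abstract describes would replace only the scoring line inside `tfidf_select`.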