• DocumentCode
    3544224
  • Title

    Filtering spam e-mail with Generalized Additive Neural Networks

  • Author

    Du Toit, Tiny ; Kruger, Hennie

  • Author_Institution
    Sch. of Comput., Stat. & Math. Sci., North-West Univ., Potchefstroom, South Africa
  • fYear
    2012
  • fDate
    15-17 Aug. 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Some of the major security risks associated with spam e-mail are the spreading of computer viruses and the facilitation of phishing exercises. Spam is therefore regarded as one of the prominent security threats in modern organizations. Security controls, such as spam filtering techniques, have become increasingly important to protect information and information assets. In this paper the performance of a Generalized Additive Neural Network on a publicly available e-mail corpus is investigated in the context of statistical spam filtering. The neural network is compared to a Naive Bayesian classifier and a Memory-based technique. Generalized Additive Neural Networks have a number of advantages compared to neural networks in general. An automated construction algorithm performs feature and model selection simultaneously and produces results which can be interpreted by a graphical method. This algorithm is powerful, effective, and highly accurate compared to other non-linear model selection methods. The paper also considers the impact of different feature set sizes using cost-sensitive measures. These criteria are sensitive to the cost difference between two common types of errors made by filtering systems. Experiments show better performance than the Naive Bayes and Memory-based classifiers when legitimate e-mails are assigned the same cost as spam messages. This result suggests Generalized Additive Neural Networks may be utilized to flag spam e-mails in order to prioritize the reading of messages.
  • Keywords
    Bayes methods; computer viruses; information filtering; neural nets; unsolicited e-mail; Naive Bayesian classifier; automated construction algorithm; e-mail corpus; filtering spam e-mail; generalized additive neural networks; graphical method; information assets; information protection; memory based technique; phishing exercises; security controls; security risks; security threats; spam filtering techniques; statistical spam filtering; Accuracy; Additives; Bayesian methods; Biological neural networks; Unsolicited electronic mail; Generalized Additive Neural Network; Memory-based classifier; Neural Network; Security risk; Spam; Spam filtering
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Security for South Africa (ISSA), 2012
  • Conference_Location
    Johannesburg, Gauteng
  • Print_ISBN
    978-1-4673-2160-0
  • Type

    conf

  • DOI
    10.1109/ISSA.2012.6320446
  • Filename
    6320446