• DocumentCode
    3003396
  • Title
    Identifying Image Spam based on Header and File Properties using C4.5 Decision Trees and Support Vector Machine Learning
  • Author
    Krasser, Sven ; Tang, Yuchun C. ; Gould, Jeremy ; Alperovitch, Dmitri ; Judge, Paul
  • Author_Institution
    Secure Comput. Corp., Alpharetta
  • fYear
    2007
  • fDate
    20-22 June 2007
  • Firstpage
    255
  • Lastpage
    261
  • Abstract
    Image spam poses a great threat to email communications due to high volumes, bigger bandwidth requirements, and higher processing requirements for filtering. We present a feature extraction and classification framework that operates on features that can be extracted from image files very quickly. The features considered are thoroughly analyzed with respect to their information gain. We present classification performance results for C4.5 decision tree and support vector machine classifiers. Lastly, we compare the performance that can be achieved using these fast features to that of a more complex image classifier operating on morphological features extracted from fully decoded images. The proposed classifier is able to detect a large number of malicious images while being computationally inexpensive.
  • Keywords
    decision trees; feature extraction; image classification; support vector machines; unsolicited e-mail; C4.5 decision trees; decoded images; email communications; feature extraction; file properties; filtering; header properties; image classifier; image spam; malicious images; morphological features; support vector machine learning; Bandwidth; Classification tree analysis; Data mining; Decision trees; Electronic mail; Feature extraction; Machine learning; Support vector machine classification; Support vector machines; Unsolicited electronic mail
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Assurance and Security Workshop, 2007. IAW '07. IEEE SMC
  • Conference_Location
    West Point, NY
  • Print_ISBN
    1-4244-1304-4
  • Electronic_ISBN
    1-4244-1304-4
  • Type
    conf
  • DOI
    10.1109/IAW.2007.381941
  • Filename
    4267569