• DocumentCode
    623273
  • Title

    Classification of malware families based on N-grams sequential pattern features

  • Author

    Liangboonprakong, Chatchai ; Sornil, Ohm

  • Author_Institution
    Dept. of Comput. Sci., Suan Sunandha Rajabhat Univ., Bangkok, Thailand
  • fYear
    2013
  • fDate
    19-21 June 2013
  • Firstpage
    777
  • Lastpage
    782
  • Abstract
    Malware family identification is a complex process involving extraction of distinctive characteristics from a set of malware samples. Malware authors employ various techniques, such as encryption and obfuscation, to prevent the identification of unique characteristics of their programs. In this paper, we present n-gram-based sequential features extracted from the content of files. N-grams are extracted from files; sequential n-gram patterns are determined; pattern statistics are calculated and reduced by the sequential floating forward selection method; and a classifier is used to determine the malware family. Three classification models are studied: C4.5, multilayer perceptron, and support vector machine. Experimental results on a standard malware test collection show that the proposed method performs well, with a classification accuracy of 96.64%.
  • Keywords
    invasive software; multilayer perceptrons; pattern classification; statistical analysis; support vector machines; C4.5 classification model; complex process; encryption characteristics; file content; malware family classification; malware family identification; multilayer perceptron classification model; n-gram-based sequential feature extraction; obfuscation characteristics; pattern statistics; sequential floating forward selection method; standard malware test collection; support vector machine classification model; Accuracy; Artificial neural networks; Classification algorithms; Decision trees; Feature extraction; Malware; Support vector machines; C4.5; Malware Classification; Multilayer Perceptron; N-Gram; Sequential Floating Forward Selection; Sequential Pattern; Support Vector Machine;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Industrial Electronics and Applications (ICIEA), 2013 8th IEEE Conference on
  • Conference_Location
    Melbourne, VIC
  • Print_ISBN
    978-1-4673-6320-4
  • Type

    conf

  • DOI
    10.1109/ICIEA.2013.6566472
  • Filename
    6566472
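
The first step named in the abstract, extracting n-grams from file content, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say whether the n-grams are taken over raw bytes, opcodes, or some other unit, nor which value of n is used. The sketch assumes byte-level n-grams with a sliding window of one byte, and the names `byte_ngrams`, `ngram_counts`, and `sample` are hypothetical.

```python
from collections import Counter


def byte_ngrams(data: bytes, n: int = 2):
    """Yield every contiguous n-byte window of `data`, sliding by one byte.

    Assumption: n-grams are taken over raw bytes; the record does not
    state the unit or the value of n used in the paper.
    """
    for i in range(len(data) - n + 1):
        yield data[i:i + n]


def ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count how often each n-gram occurs in `data`."""
    return Counter(byte_ngrams(data, n))


# Hypothetical six-byte sample standing in for a file's content.
sample = bytes.fromhex("4d5a90004d5a")
counts = ngram_counts(sample, n=2)
```

Counts like these would only be a starting point. The sequential pattern mining, the sequential floating forward selection step, and the classifiers described in the abstract are not shown here.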