DocumentCode
671701
Title
Application of string kernel based support vector machine for malware packer identification
Author
Ban, Toshinori ; Isawa, Ryoichi ; Shanqing Guo ; Inoue, Daisuke ; Nakao, Kengo
Author_Institution
Nat. Inst. of Inf. & Commun. Technol., Koganei, Japan
fYear
2013
fDate
4-9 Aug. 2013
Firstpage
1
Lastpage
8
Abstract
Packing is among the most popular obfuscation techniques used to prevent anti-virus scanners from detecting malware. In this paper we propose a string-kernel-based support vector machine classifier to identify the packer used to create a given malware program. Our approach has the following characteristics. First, the adoption of a string-kernel-based method bridges the gap between signature-based and machine-learning-based approaches. Second, the kernel function derived from the Levenshtein distance integrates important domain knowledge into the learning process. Third, applying a support vector machine, a state-of-the-art classifier, enables an automated packer identification scheme with high generalization ability and time efficiency. Finally, selecting the code segment that carries the most essential packer-relevant information further boosts classification performance. Experiments on a dataset of 3228 binary programs, composed of packed files created by 25 packers, show that the proposed approach outperforms PEiD and previous machine-learning-based approaches in prediction accuracy by a large margin. This method can help improve the scanning efficiency of anti-virus products and promote efficient back-end malware research.
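The abstract does not give the exact form of the kernel, so the following is only a minimal sketch of how a kernel might be derived from the Levenshtein distance. The exponential form `exp(-gamma * d)`, the `gamma` value, and the function names `levenshtein`, `edit_kernel`, and `gram_matrix` are assumptions made for illustration, not the authors' implementation. Kernels built this way are not guaranteed to be positive semidefinite, so this should be read as an illustration of the idea rather than a drop-in component.

```python
import math


def levenshtein(a, b):
    """Edit distance between two sequences (str or bytes).

    Insertions, deletions, and substitutions each cost 1.
    Uses two rolling rows, so memory is O(min(len(a), len(b))).
    """
    if len(a) < len(b):
        a, b = b, a  # keep the shorter sequence as the row
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def edit_kernel(a, b, gamma=0.1):
    """Hypothetical similarity: exp(-gamma * edit distance). Assumed form."""
    return math.exp(-gamma * levenshtein(a, b))


def gram_matrix(samples, gamma=0.1):
    """Pairwise kernel values for a list of byte strings."""
    return [[edit_kernel(s, t, gamma) for t in samples] for s in samples]


if __name__ == "__main__":
    # Made-up byte sequences standing in for extracted code segments.
    segments = [b"\x60\xbe\x00\x10", b"\x60\xbe\x00\x20", b"\x55\x8b\xec\x83"]
    for row in gram_matrix(segments):
        print([round(v, 3) for v in row])
```

A matrix of this kind could then be handed to any support vector machine implementation that accepts a precomputed kernel; the byte strings in the usage example are invented placeholders, not data from the paper.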
Keywords
invasive software; learning (artificial intelligence); support vector machines; Levenshtein distance; automated packer identification scheme; kernel function; machine-learning-based approach; malware packer identification; malware program; obfuscation techniques; state-of-the-art classifier; string-kernel-based support vector machine classifier; Feature extraction; Kernel; Malware; Measurement; Sun; Support vector machines; Training
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location
Dallas, TX
ISSN
2161-4393
Print_ISBN
978-1-4673-6128-6
Type
conf
DOI
10.1109/IJCNN.2013.6707043
Filename
6707043