DocumentCode
722517
Title
Discovering similar malware samples using API call topics
Author
Fujino, Akinori ; Murakami, Junichi ; Mori, Tatsuya
Author_Institution
Dept. of Commun. Eng., Waseda Univ., Tokyo, Japan
fYear
2015
fDate
9-12 Jan. 2015
Firstpage
140
Lastpage
147
Abstract
To automate malware analysis, dynamic malware analysis systems have attracted increasing attention from both industry and the research community. Of the various logs collected by such systems, API call logs are a particularly promising source of information for characterizing malware behavior. This work aims to identify similar malware samples automatically using the concept of “API call topics,” where each topic represents a set of API calls intrinsic to a specific group of malware samples. We first convert Win32 API calls into “API words.” We then apply non-negative matrix factorization (NMF) clustering analysis to the corpus of extracted API words; NMF automatically generates the API call topics from the API words. The contributions of this work are as follows. We present an unsupervised approach to extracting API call topics from a large corpus of API calls. Through analysis of API call logs collected from thousands of malware samples, we demonstrate that the extracted API call topics can detect similar malware samples. The proposed approach is expected to be useful for automating the analysis of the huge volume of logs collected from dynamic malware analysis systems.
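The pipeline summarized in the abstract (API calls to API words, then NMF over the resulting corpus) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that an API word is simply the API function name, that the corpus is a sample-by-word count matrix, and that NMF is fitted with the standard multiplicative update rules for the squared-error objective. The sample names, call traces, and the helpers `build_matrix`, `nmf`, and `cosine` are hypothetical.

```python
import math
import random

def build_matrix(docs):
    """Return (vocab, V), where V[i][j] counts API word j in sample i."""
    vocab = sorted({w for d in docs for w in d})
    index = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d:
            V[i][index[w]] += 1.0
    return vocab, V

def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, iters=500, seed=0, eps=1e-9):
    """Approximate V (n x m) by W (n x k) times H (k x m), all non-negative.

    Uses multiplicative updates (an assumption of this sketch):
      H <- H * (W^T V) / (W^T W H),  W <- W * (V H^T) / (W H H^T)
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def cosine(u, v, eps=1e-12):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + eps)

# Hypothetical API call traces; each API word is just the function name.
samples = {
    "sample_a": ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey", "RegSetValueExW"],
    "sample_b": ["RegOpenKeyExW", "RegSetValueExW", "RegCloseKey"],
    "sample_c": ["socket", "connect", "send", "send"],
    "sample_d": ["socket", "connect", "send", "recv"],
}
names = list(samples)
vocab, V = build_matrix([samples[name] for name in names])

# Rows of W hold per-sample topic weights; rows of H hold per-topic word weights.
W, H = nmf(V, k=2)

# Samples with similar topic weights are treated as behaviorally similar.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(names[i], names[j], round(cosine(W[i], W[j]), 3))
```

In this sketch, each row of `H` plays the role of an API call topic, and comparing rows of `W` is one simple way to look for similar samples. A real corpus would likely need a sparse matrix library and a principled choice of the number of topics `k`.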
Keywords
application program interfaces; invasive software; matrix decomposition; pattern clustering; API call logs; API call topics; API words; NMF; Win32 API calls; dynamic malware analysis systems; malware behavior; non-negative matrix factorization clustering analysis; similar malware samples; unsupervised approach;
fLanguage
English
Publisher
IEEE
Conference_Titel
Consumer Communications and Networking Conference (CCNC), 2015 12th Annual IEEE
Conference_Location
Las Vegas, NV
ISSN
2331-9860
Print_ISBN
978-1-4799-6389-8
Type
conf
DOI
10.1109/CCNC.2015.7157960
Filename
7157960