• DocumentCode
    722517
  • Title
    Discovering similar malware samples using API call topics
  • Author
    Fujino, Akinori ; Murakami, Junichi ; Mori, Tatsuya
  • Author_Institution
    Dept. of Commun. Eng., Waseda Univ., Tokyo, Japan
  • fYear
    2015
  • fDate
    9-12 Jan. 2015
  • Firstpage
    140
  • Lastpage
    147
  • Abstract
    To automate malware analysis, dynamic malware analysis systems have attracted increasing attention from both the industry and research communities. Of the various logs collected by such systems, the API call is a very promising source of information for characterizing malware behavior. This work aims to extract similar malware samples automatically using the concept of “API call topics,” which represents a set of API calls that are intrinsic to a specific group of malware samples. We first convert Win32 API calls into “API words.” We then apply non-negative matrix factorization (NMF) clustering analysis to the corpus of the extracted API words. NMF automatically generates the API call topics from the API words. The contributions of this work can be summarized as follows. We present an unsupervised approach to extract API call topics from a large corpus of API calls. Through analysis of the API call logs collected from thousands of malware samples, we demonstrate that the extracted API call topics can detect similar malware samples. The proposed approach is expected to be useful for automating the process of analyzing a huge volume of logs collected from dynamic malware analysis systems.
  • Keywords
    application program interfaces; invasive software; matrix decomposition; pattern clustering; API call logs; API call topics; API words; NMF; Win32 API calls; dynamic malware analysis systems; malware behavior; non-negative matrix factorization clustering analysis; similar malware samples; unsupervised approach
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Consumer Communications and Networking Conference (CCNC), 2015 12th Annual IEEE
  • Conference_Location
    Las Vegas, NV
  • ISSN
    2331-9860
  • Print_ISBN
    978-1-4799-6389-8
  • Type
    conf
  • DOI
    10.1109/CCNC.2015.7157960
  • Filename
    7157960
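
The pipeline summarized in the abstract (API calls, then API words, then NMF, then API call topics) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy logs are invented, the hypothetical `to_api_words` helper simply lowercases API names because the abstract does not define the word format, and the factorization uses the standard multiplicative update rules for the Frobenius norm, which may differ from the NMF variant the authors used. Grouping samples by their dominant topic is likewise only one plausible way to read "similar samples" from the result.

```python
import random
from collections import defaultdict

# Toy API call logs (hypothetical data): one list of Win32 API names per sample.
logs = [
    ["CreateFileW", "WriteFile", "WriteFile", "RegSetValueExW"],
    ["CreateFileW", "WriteFile", "RegSetValueExW", "RegSetValueExW"],
    ["socket", "connect", "send", "send"],
    ["socket", "connect", "connect", "send"],
]

def to_api_words(log):
    # Assumption: one "API word" per call, derived from the API name alone.
    return [name.lower() for name in log]

def build_matrix(logs):
    # Rows are samples, columns are API words, entries are call counts.
    vocab = sorted({w for log in logs for w in to_api_words(log)})
    col = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in logs]
    for i, log in enumerate(logs):
        for w in to_api_words(log):
            V[i][col[w]] += 1.0
    return V, vocab

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, c)) for c in zip(*B)] for row in A]

def transpose(A):
    return [list(c) for c in zip(*A)]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    # Factor V (n x m) into W (n x k) and H (k x m), both non-negative,
    # using multiplicative updates for the squared Frobenius error.
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

V, vocab = build_matrix(logs)
W, H = nmf(V, k=2)

# Each row of H is one topic (weights over API words);
# each row of W gives one sample's weights over the topics.
dominant = [max(range(len(row)), key=row.__getitem__) for row in W]
groups = defaultdict(list)
for sample, topic in enumerate(dominant):
    groups[topic].append(sample)

top_words = [
    [vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -h[j])[:3]]
    for h in H
]
print(dict(groups))
print(top_words)
```

In this sketch, samples that share a dominant topic are treated as similar, and the highest-weighted words in each row of `H` describe what that topic represents. A real analysis would run on logs from a dynamic analysis system and would need a principled choice of the number of topics `k`.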