DocumentCode :
3022774
Title :
Automatic Reverse Engineering of Malware Emulators
Author :
Sharif, Monirul ; Lanzi, Andrea ; Giffin, Jonathon ; Lee, Wenke
Author_Institution :
Sch. of Comput. Sci., Georgia Inst. of Technol., Atlanta, GA, USA
fYear :
2009
fDate :
17-20 May 2009
Firstpage :
94
Lastpage :
109
Abstract :
Malware authors have recently begun using emulation technology to obfuscate their code. They convert native malware binaries into bytecode programs written in a randomly generated instruction set and paired with a native binary emulator that interprets the bytecode. No existing malware analysis can reliably reverse this obfuscation technique. In this paper, we present the first work in automatic reverse engineering of malware emulators. Our algorithms are based on dynamic analysis. We execute the emulated malware in a protected environment and record the entire x86 instruction trace generated by the emulator. We then use dynamic data-flow and taint analysis over the trace to identify data buffers containing the bytecode program and extract the syntactic and semantic information about the bytecode instruction set. With these analysis outputs, we are able to generate data structures, such as control-flow graphs, that provide the foundation for subsequent malware analysis. We implemented a proof-of-concept system called Rotalume and evaluated it using both legitimate programs and malware emulated by VMProtect and Code Virtualizer. The results show that Rotalume accurately reveals the syntax and semantics of emulated instruction sets and reconstructs execution paths of original programs from their bytecode representations.
Keywords :
data flow analysis; data structures; invasive software; reverse engineering; Rotalume; VMProtect; automatic reverse engineering; bytecode instruction set; bytecode programs; code virtualizer; control flow graphs; data structures; dynamic data flow analysis; dynamic taint analysis; emulation technology; malware emulators; proof-of-concept system; x86 instruction trace; Algorithm design and analysis; Computer buffers; Data analysis; Data mining; Data structures; Emulation; Information analysis; Instruction sets; Protection; Reverse engineering; Emulation; Malware Analysis; Obfuscation; Reverse-engineering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Security and Privacy, 2009 30th IEEE Symposium on
Conference_Location :
Berkeley, CA
ISSN :
1081-6011
Print_ISBN :
978-0-7695-3633-0
Type :
conf
DOI :
10.1109/SP.2009.27
Filename :
5207639