Regional Information Center for Science and Technology - A Practical Guide for Detecting the Java Script-Based Malware Using Hidden Markov Models and Linear Classifiers

DocumentCode :

262037

Title :

A Practical Guide for Detecting the Java Script-Based Malware Using Hidden Markov Models and Linear Classifiers

Author :

Cosovan, Doina ; Benchea, Razvan ; Gavrilut, Dragos

Author_Institution :

Romania Bitdefender Anti-virus Res. Lab., Al.I. Cuza Univ. of Iasi, Iasi, Romania

Year :

2014

Date :

22-25 Sept. 2014

Firstpage :

236

Lastpage :

243

Abstract :

The World Wide Web has evolved so rapidly that it is no longer considered a luxury but a necessity. As a result, the most popular infection vectors used by cyber criminals today are either web pages or commonly used documents (such as PDF files). In both cases, the malicious actions are written in JavaScript, which has therefore become the preferred language for spreading malware. To stop malicious content from executing, detecting its infection vector is crucial. In this paper we propose various methods for detecting JavaScript-based attack vectors. To achieve this goal, we first need to counter the metamorphism techniques commonly used in malicious JavaScript code, which are by no means trivial: garbage instruction insertion, variable renaming, equivalent instruction substitution, function permutation, instruction reordering, and so on. Our approach to dealing with metamorphism starts by splitting the JavaScript content into components and filtering out the insignificant ones. We then use a dataset of over one million JavaScript files to test several machine learning algorithms for malware detection, such as Hidden Markov Models, linear classifiers, and hybrid approaches. Finally, we analyze these detection methods from a practical point of view, emphasizing the need for a very low false positive rate and the ability to be trained on large datasets.

Keywords :

Java; Web sites; hidden Markov models; invasive software; learning (artificial intelligence); pattern classification; vectors; JavaScript content; JavaScript files; JavaScript malicious code; JavaScript-Based malware detection; JavaScript-based attack vector detection; Web pages; World Wide Web; cybercriminals; equivalent instruction substitution; function permutation; garbage instruction insertion; hidden Markov models; infection vectors; instruction reordering; linear classifiers; machine learning algorithms; metamorphism techniques; variable renaming; Feature extraction; HTML; Hidden Markov models; Malware; Portable document format; Reactive power; Vectors; Hidden Markov Model; Java Script; Linear Classifier; Machine Learning; PDF; detection; infection vector; malware; metamorphism;

Language :

English

Publisher :

IEEE

Conference_Title :

2014 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC)

Conference_Location :

Timisoara

Print_ISBN :

978-1-4799-8447-3

Type :

conf

DOI :

10.1109/SYNASC.2014.39

Filename :

7034689

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=262037