DocumentCode
2556723
Title
Sequencegram: n-gram modeling of system calls for program based anomaly detection
Author
Hubballi, Neminath ; Biswas, Santosh ; Nandi, Sukumar
Author_Institution
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Guwahati, India
fYear
2011
fDate
4-8 Jan. 2011
Firstpage
1
Lastpage
10
Abstract
Our contribution in this paper is twofold. First, we provide preliminary investigation results establishing that program-based anomaly detection is effective if short system call sequences are modeled along with their occurrence frequencies. Second, as a consequence, the resulting model of normal program behavior can tolerate some level of contamination in the training dataset. We describe an experimental system, Sequencegram, designed to validate these contributions. Sequencegram models short sequences of system calls as n-grams and stores them in a tree, called an n-gram-tree, for space efficiency. Every short sequence is assigned an anomaly score, based on its occurrence frequency, which represents the probability that the sequence is anomalous. Because normal and abnormal sequences are generally assumed to follow a skewed distribution, more frequently occurring sequences are given lower anomaly scores and vice versa. The anomaly scores of individual n-grams contribute to the anomaly score of a program trace.
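The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the class names (`Node`, `NGramTree`), the per-n-gram score (one minus relative frequency), and the trace score (mean of the n-gram scores) are all assumptions chosen only to show frequent sequences receiving lower scores.

```python
class Node:
    """One system call on a path of the tree (hypothetical structure)."""
    def __init__(self):
        self.children = {}  # next system call -> Node
        self.count = 0      # times an n-gram ended at this node


class NGramTree:
    """Stores n-grams of system calls with their occurrence counts."""
    def __init__(self, n):
        self.n = n
        self.root = Node()
        self.total = 0      # number of n-grams seen during training

    def add_trace(self, trace):
        # Slide a window of length n over the trace and insert each n-gram.
        for i in range(len(trace) - self.n + 1):
            node = self.root
            for call in trace[i:i + self.n]:
                node = node.children.setdefault(call, Node())
            node.count += 1
            self.total += 1

    def count(self, gram):
        # Walk the tree; an unseen n-gram has count 0.
        node = self.root
        for call in gram:
            node = node.children.get(call)
            if node is None:
                return 0
        return node.count

    def ngram_score(self, gram):
        # Assumed scoring rule: 1 - relative frequency, so frequent
        # n-grams score low and unseen n-grams score 1.0.
        if self.total == 0:
            return 1.0
        return 1.0 - self.count(gram) / self.total

    def trace_score(self, trace):
        # Assumed aggregation: mean of the individual n-gram scores.
        grams = [tuple(trace[i:i + self.n])
                 for i in range(len(trace) - self.n + 1)]
        if not grams:
            return 0.0
        return sum(self.ngram_score(g) for g in grams) / len(grams)


tree = NGramTree(3)
tree.add_trace(["open", "read", "write", "close"])
normal = tree.trace_score(["open", "read", "write", "close"])
unseen = tree.trace_score(["open", "read", "exec", "close"])
```

In this sketch the trace seen during training scores lower than the trace containing an unseen call, and shared prefixes are stored once in the tree, which is where the space saving comes from.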
Keywords
security of data; sequences; telecommunication security; abnormal sequences; program based anomaly detection; sequencegram model short sequences; system calls; training dataset; Clustering algorithms; Intrusion detection; Mathematical model; Monitoring; Testing; Training; Windows; Intrusion detection system; Program based anomaly detection; n-gram based analysis;
fLanguage
English
Publisher
ieee
Conference_Titel
Communication Systems and Networks (COMSNETS), 2011 Third International Conference on
Conference_Location
Bangalore
Print_ISBN
978-1-4244-8952-7
Electronic_ISBN
978-1-4244-8951-0
Type
conf
DOI
10.1109/COMSNETS.2011.5716416
Filename
5716416