Title :
Sequencegram: n-gram modeling of system calls for program based anomaly detection
Author :
Hubballi, Neminath ; Biswas, Santosh ; Nandi, Sukumar
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Guwahati, India
Abstract :
Our contribution in this paper is twofold. First, we provide preliminary investigation results establishing that program-based anomaly detection is effective if short system call sequences are modeled along with their occurrence frequency. Second, as a consequence, the resulting model of normal program behavior can tolerate some level of contamination in the training dataset. We describe an experimental system, Sequencegram, designed to validate these contributions. Sequencegram models short sequences of system calls in the form of n-grams and stores them in a tree, called an n-gram-tree, for space efficiency. An anomaly score, based on occurrence frequency, is associated with every short sequence and represents the probability of that sequence being anomalous. Since it is generally assumed that the distribution of normal and abnormal sequences is skewed, more frequently occurring sequences are given lower anomaly scores and vice versa. The individual n-gram anomaly scores contribute to the anomaly score of a program trace.
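The idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name NGramTree, the per-n-gram score (one minus relative frequency), the score of 1.0 for unseen n-grams, and the use of the mean to combine n-gram scores into a trace score are all assumptions made for illustration, since the abstract does not give the exact formulas.

```python
class NGramTree:
    """Hypothetical n-gram tree: a trie keyed by system call names.

    Internal nodes are dicts; the node reached after n calls holds an
    integer occurrence count for that n-gram.
    """

    def __init__(self, n):
        self.n = n
        self.root = {}
        self.total = 0  # number of n-grams seen during training

    def _ngrams(self, trace):
        # Slide a window of length n over the trace.
        return (tuple(trace[i:i + self.n])
                for i in range(len(trace) - self.n + 1))

    def train(self, trace):
        for gram in self._ngrams(trace):
            node = self.root
            for call in gram[:-1]:
                node = node.setdefault(call, {})
            node[gram[-1]] = node.get(gram[-1], 0) + 1
            self.total += 1

    def count(self, gram):
        node = self.root
        for call in gram:
            if not isinstance(node, dict) or call not in node:
                return 0
            node = node[call]
        return node if isinstance(node, int) else 0

    def ngram_score(self, gram):
        # Assumed scoring rule: frequent n-grams get lower scores,
        # unseen n-grams get the maximum score of 1.0.
        if self.total == 0:
            return 1.0
        return 1.0 - self.count(gram) / self.total

    def trace_score(self, trace):
        # Assumed aggregation: mean of the individual n-gram scores.
        grams = list(self._ngrams(trace))
        if not grams:
            return 0.0
        return sum(self.ngram_score(g) for g in grams) / len(grams)


tree = NGramTree(n=3)
tree.train(["open", "read", "write", "close"] * 2)
normal = tree.trace_score(["open", "read", "write", "close"])
suspect = tree.trace_score(["open", "exec", "write", "close"])
```

Because scores depend on frequency and not only on membership, a rare contaminating sequence in the training data still receives a high score in this sketch, which mirrors the tolerance to contamination that the abstract claims.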
Keywords :
security of data; sequences; telecommunication security; abnormal sequences; program based anomaly detection; sequencegram model short sequences; system calls; training dataset; Clustering algorithms; Intrusion detection; Mathematical model; Monitoring; Testing; Training; Windows; Intrusion detection system; Program based anomaly detection; n-gram based analysis;
Conference_Title :
Communication Systems and Networks (COMSNETS), 2011 Third International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4244-8952-7
Electronic_ISBN :
978-1-4244-8951-0
DOI :
10.1109/COMSNETS.2011.5716416