Towards identifying OS-level anomalies to detect application software failures

Author

Bovenzi, Antonio ; Russo, Stefano ; Brancati, Francesco ; Bondavalli, Andrea

Author_Institution

Dipt. di Inf. e Sist. (DIS), Univ. degli Studi di Napoli Federico II, Naples, Italy

fYear

2011

fDate

10-11 Oct. 2011

Firstpage

71

Lastpage

76

Abstract

The next generation of critical systems, namely Large-scale Complex Critical Infrastructures (LCCIs), requires efficient runtime management, reconfiguration strategies, and the ability to make decisions on the basis of the current and past behavior of the system. Anomaly-based detection, leveraging information gathered at the Operating System (OS) level (e.g., the number of system call errors, signals, and held semaphores per time unit), seems to be a promising approach to revealing application faults online. Recently, an experimental campaign to evaluate the performance of two anomaly detection algorithms was performed on a case study from the Air Traffic Management (ATM) domain, deployed under the popular OS used in the production environment, i.e., Red Hat 5 EL. In this paper we investigate the impact of the OS and of the monitored resources on the quality of the detection by executing experiments on Windows Server 2008. The experimental results make it possible to identify which of the two operating systems provides the monitoring facilities best suited to implementing the anomaly detection algorithms that we have considered. Moreover, a numerical sensitivity analysis of the detector parameters is carried out to understand the impact of their settings on performance.

Keywords

operating systems (computers); safety-critical software; sensitivity analysis; software performance evaluation; system monitoring; ATM domain; LCCI; OS level; OS-level anomaly; Windows Server 2008; air traffic management domain; anomaly detection algorithms; anomaly-based detection; complex critical infrastructures; critical systems; detector parameters; monitored resources; monitoring facility; next generation; online application faults; operating system level; operating systems; performance evaluation; production environment; reconfiguration strategy; runtime management; semaphores; sensitivity analysis; software failures; Accuracy; Detectors; Instruction sets; Linux; Measurement; Monitoring; Servers; OS-level monitoring; anomaly detection; software failure;

fLanguage

English

Publisher

IEEE

Conference_Titel

Measurements and Networking Proceedings (M&N), 2011 IEEE International Workshop on

Conference_Location

Anacapri

Print_ISBN

978-1-4577-0455-0

Type

conf

DOI

10.1109/IWMN.2011.6088494

Filename

6088494