Title :
Experimental analysis of the first order time difference of indicators used in the monitoring of complex systems
Author :
Bondavalli, Andrea ; Brancati, Francesco ; Ceccarelli, Andrea ; Santoro, Diego ; Vadursi, Michele
Author_Institution :
Dept. of Math. & Comput. Sci., Univ. of Firenze, Florence, Italy
Abstract :
Complex and real-time systems often operate under variable and non-stationary conditions, and therefore require efficient and extensive monitoring and error detection solutions. Among these, we focus on anomaly detection techniques, which measure the evolution of the monitored indicators through time to identify anomalies, i.e., deviations from the expected operational behavior. In this paper, we investigate whether the evolution of indicators through time can be modeled with the random walk model. In particular, we focus on the detection of system anomalies at the application level (software errors), based on the monitoring of indicators at the Operating System level. The approach is based on the experimental evaluation of a large set of heterogeneous indicators, acquired under different operating conditions, both in terms of workload and fault load, on an air traffic management target system. The results of the analysis show that, in a large number of cases, the histogram of the first order time differences is well approximated by a Gaussian distribution, independently of the nature of the indicator and its statistical distribution. These outcomes suggest that adopting a Gaussian random walk model for several monitoring indicators has experimental support and deserves to be further investigated on a wider scale, in order to determine its range of applicability and representativeness.
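As an illustration of the idea summarized in the abstract, the following minimal Python sketch computes the first order time differences of an indicator series and checks how close they are to a Gaussian. It is not the authors' procedure: the series is synthetic (a simulated Gaussian random walk, not data from the air traffic management system), the function names are hypothetical, and excess kurtosis is used here only as one simple, assumed indicator of Gaussianity in place of the histogram analysis described in the paper.

```python
import random
import statistics


def first_order_differences(samples):
    """Return d[k] = x[k+1] - x[k] for consecutive samples of an indicator."""
    return [b - a for a, b in zip(samples, samples[1:])]


def excess_kurtosis(values):
    """Population excess kurtosis; it is close to 0 for Gaussian data."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    fourth_moment = statistics.fmean([(v - mean) ** 4 for v in values])
    return fourth_moment / (variance ** 2) - 3.0


def simulate_gaussian_random_walk(n, sigma, seed=0):
    """Synthetic indicator (an assumption): x[k+1] = x[k] + N(0, sigma^2)."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(x[-1] + rng.gauss(0.0, sigma))
    return x


# Synthetic stand-in for a monitored Operating System level indicator.
walk = simulate_gaussian_random_walk(n=20000, sigma=2.0, seed=42)

# The walk itself is non-stationary, but its increments should look Gaussian.
diffs = first_order_differences(walk)
diff_mean = statistics.fmean(diffs)
diff_std = statistics.pstdev(diffs)
diff_kurt = excess_kurtosis(diffs)

print(f"mean={diff_mean:.3f} std={diff_std:.3f} excess_kurtosis={diff_kurt:.3f}")
```

For a series that follows a Gaussian random walk, the differences should have a mean near zero, a standard deviation near the step size, and an excess kurtosis near zero. A real indicator would be loaded from monitoring logs in place of the simulated walk, and a proper goodness-of-fit test would be a natural next step.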
Keywords :
Gaussian distribution; error detection; large-scale systems; operating systems (computers); security of data; system monitoring; Gaussian random walk model; air traffic management target system; application level; complex system monitoring; error detection solutions; expected operational behavior; fault load; first order time difference experimental analysis; heterogeneous indicators; monitored indicator evolution measurement; nonstationary conditions; operating system level; real-time systems; software errors; statistical distribution; system anomaly detection techniques; variable conditions; Histograms; Linux; Monitoring; Servers; Software; Synchronization; Throughput; anomaly detection; random walk; systems of systems
Conference_Title :
Measurements and Networking Proceedings (M&N), 2013 IEEE International Workshop on
Conference_Location :
Naples
Print_ISBN :
978-1-4673-2873-9
DOI :
10.1109/IWMN.2013.6663792