DocumentCode :
1397287
Title :
System Monitoring with Metric-Correlation Models
Author :
Jiang, Miao ; Munawar, Mohammad A. ; Reidemeister, Thomas ; Ward, Paul A S
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Waterloo, Waterloo, ON, Canada
Volume :
8
Issue :
4
fYear :
2011
fDate :
12/1/2011
Firstpage :
348
Lastpage :
360
Abstract :
Modern software systems expose management metrics to help track their health. Recently, it was demonstrated that correlations among these metrics allow errors to be detected and their causes localized. Prior research shows that linear models can capture many of these correlations. However, our research shows that several factors may prevent linear models from accurately describing correlations, even if the underlying relationship is linear. Common phenomena we have observed include relationships that evolve, relationships with missing variables, and heterogeneous residual variance of the correlated metrics. Usually these phenomena can be discovered by testing for heteroscedasticity of the underlying linear models. Such behavior violates the assumptions of simple linear regression, which thus fails to describe system dynamics correctly. In this paper we address the above challenges by employing efficient variants of Ordinary Least Squares regression models. In addition, we automate the process of error detection by introducing the Wilcoxon Rank-Sum test after proper correlations modeling. We validate our models using a realistic Java-Enterprise-Edition application. Using fault-injection experiments we show that our improved models capture system behavior accurately.
Keywords :
error detection; least squares approximations; program testing; regression analysis; software fault tolerance; software metrics; system monitoring; Java-Enterprise-Edition application; Wilcoxon rank-sum test; correlations modeling; error detection; fault injection experiment; heterogeneous residual variance; heteroscedasticity testing; linear model; linear regression; metric-correlation model; ordinary least squares regression model; software system management metrics; system dynamics; system monitoring; Adaptation models; Computational modeling; Correlation; Measurement; Monitoring; Predictive models; Software systems; System monitoring; fault detection; heteroscedasticity; metric-correlation models; multi-variable correlations; recursive least squares;
fLanguage :
English
Journal_Title :
Network and Service Management, IEEE Transactions on
Publisher :
IEEE
ISSN :
1932-4537
Type :
jour
DOI :
10.1109/TNSM.2011.120811.100033
Filename :
6102277