DocumentCode
2149661
Title
Dependency-aware fault diagnosis with metric-correlation models in enterprise software systems
Author
Jiang, Miao ; Munawar, Mohammad A. ; Reidemeister, Thomas ; Ward, Paul A S
Author_Institution
E&CE Dept., Univ. of Waterloo, Waterloo, ON, Canada
fYear
2010
fDate
25-29 Oct. 2010
Firstpage
134
Lastpage
141
Abstract
The normal operation of enterprise software systems can be modeled by stable correlations between various system metrics; errors are detected when some of these correlations fail to hold. The typical approach to diagnosis (i.e., pinpointing the faulty component) based on the correlation models is to use the Jaccard coefficient or some variant thereof, without reference to system structure, dependency data, or prior fault data. In this paper we demonstrate the intrinsic limitations of this approach and propose a solution that mitigates them. We assume knowledge of the dependencies between components in the system and take this information into account when analyzing the correlation models. We also propose using the Tanimoto coefficient instead of the Jaccard coefficient to assign anomaly scores to components. We evaluate our new algorithm on a Trade6-based test-bed. We show that it ranks the faulty component among the three components with the highest anomaly scores in four out of nine cases, whereas the prior method does so in only one.
Keywords
business data processing; fault diagnosis; Jaccard coefficient; Tanimoto coefficient; Trade6-based test-bed; dependency-aware fault diagnosis; enterprise software systems; metric-correlation models; Availability; Correlation; Fault diagnosis; Measurement; Monitoring; Software systems; Time factors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Network and Service Management (CNSM), 2010 International Conference on
Conference_Location
Niagara Falls, ON
Print_ISBN
978-1-4244-8910-7
Electronic_ISBN
978-1-4244-8908-4
Type
conf
DOI
10.1109/CNSM.2010.5691319
Filename
5691319