DocumentCode
822930
Title
Modeling and Tracking of Transaction Flow Dynamics for Fault Detection in Complex Systems
Author
Jiang, Guofei ; Chen, Haifeng ; Yoshihira, Kenji
Author_Institution
NEC Labs. America Inc., Princeton, NJ
Volume
3
Issue
4
fYear
2006
Firstpage
312
Lastpage
326
Abstract
With the prevalence of Internet services and the increase of their complexity, there is a growing need to improve their operational reliability and availability. While a large amount of monitoring data can be collected from systems for fault analysis, it is hard to correlate this data effectively across distributed systems and observation time. In this paper, we analyze the mass characteristics of user requests and propose a novel approach to model and track transaction flow dynamics for fault detection in complex information systems. We measure the flow intensity at multiple checkpoints inside the system and apply system identification methods to model transaction flow dynamics between these measurements. With the learned analytical models, a model-based fault detection and isolation method is applied to track the flow dynamics in real time for fault detection. We also propose an algorithm to automatically search and validate the dynamic relationship between randomly selected monitoring points. Our algorithm enables systems to have self-cognition capability for system management. Our approach is tested in a real system with a list of injected faults. Experimental results demonstrate the effectiveness of our approach and algorithms.
Keywords
Internet; data handling; distributed processing; fault diagnosis; information systems; monitoring; transaction processing; Internet service; complex information system; distributed system; fault analysis; fault detection; monitoring; operational availability; operational reliability; self-cognition capability; system identification; system management; transaction flow dynamics; Analytical models; Availability; Fault detection; Fault location; Fluid flow measurement; Information analysis; Information systems; Monitoring; System identification; Web and internet services; Fault detection; dynamic relationship; flow intensity and dynamics; information systems; model validation; model-based FDI; regression model; system management
fLanguage
English
Journal_Title
Dependable and Secure Computing, IEEE Transactions on
Publisher
IEEE
ISSN
1545-5971
Type
jour
DOI
10.1109/TDSC.2006.52
Filename
4012644