DocumentCode
796430
Title
Analysis and modeling of correlated failures in multicomputer systems
Author
Tang, Dong ; Iyer, Ravishankar K.
Author_Institution
Center for Reliable & High-Performance Comput., Illinois Univ., Urbana, IL, USA
Volume
41
Issue
5
fYear
1992
fDate
5/1/1992 12:00:00 AM
Firstpage
567
Lastpage
577
Abstract
Based on measurements from two DEC VAX-cluster multicomputer systems, the issue of correlated failures is addressed. In particular, the characteristics of correlated failures, their impact on dependability, and their modeling are discussed. It is found from the data that most correlated failures are related to errors in shared resources and propagate from one machine to another. Comparisons between measurement-based models and analytical models that assume failure independence show that the impact of correlated failures on dependability is significant. Two validated models, the c-dependent model and the p-dependent model, are developed to evaluate the dependability of systems with correlated failures.
Keywords
computation theory; fault tolerant computing; multiprocessing systems; DEC VAX-cluster; c-dependent model; correlated failures; dependability; multicomputer systems; p-dependent model; shared resources; Analytical models; Availability; Failure analysis; Fault tolerant systems; Independent component analysis; Information analysis; Markov processes; Performance analysis; Performance evaluation; Stress measurement
fLanguage
English
Journal_Title
Computers, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9340
Type
jour
DOI
10.1109/12.142683
Filename
142683