• DocumentCode
    796430
  • Title

    Analysis and modeling of correlated failures in multicomputer systems

  • Author

    Tang, Dong ; Iyer, Ravishankar K.

  • Author_Institution
    Center for Reliable & High-Performance Comput., Illinois Univ., Urbana, IL, USA
  • Volume
    41
  • Issue
    5
  • fYear
    1992
  • fDate
    5/1/1992 12:00:00 AM
  • Firstpage
    567
  • Lastpage
    577
  • Abstract
    Based on the measurements from two DEC VAX-cluster multicomputer systems, the issue of correlated failures is addressed. In particular, the characteristics of correlated failures, their impact on dependability, and their modeling are discussed. It is found from the data that most correlated failures are related to errors in shared resources and propagate from one machine to another. Comparisons between measurement-based models and analytical models that assume failure independence show that the impact of correlated failures on dependability is significant. Two validated models, the c-dependent model and the p-dependent model, are developed to evaluate the dependability of systems with correlated failures.
  • Keywords
    computation theory; fault tolerant computing; multiprocessing systems; DEC VAX-cluster; c-dependent model; correlated failures; dependability; multicomputer systems; p-dependent model; shared resources; Analytical models; Availability; Failure analysis; Fault tolerant systems; Independent component analysis; Information analysis; Markov processes; Performance analysis; Performance evaluation; Stress measurement
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.142683
  • Filename
    142683