• DocumentCode
    1354469
  • Title

    Correlated Failures in Fault-Tolerant Computers

  • Author

    Hecht, Herbert ; Dussault, Heather

  • Author_Institution
    SoHaR Incorporated, 1040 S. La Jolla Ave., Los Angeles, California 90035, USA
  • Issue
    2
  • fYear
    1987
  • fDate
    6/1/1987 12:00:00 AM
  • Firstpage
    171
  • Lastpage
    175
  • Abstract
    In two repairable ground-based fault-tolerant computer systems, in which constraints on switchover time permitted manual switching as a back-up, correlated failures were an important cause of system outage. In one of the systems, a distinction could be made between outages that occurred when one computer was undergoing scheduled maintenance and outages that occurred while one computer was being repaired. The failure rate of the active computer was at least four times higher in the latter case. Several possible causes are described but could not be confirmed from the available data. In some situations, correlated failures call for a reliability model different from the commonly described models for imperfect coverage.
  • Keywords
    Computer errors; Degradation; Design engineering; Dictionaries; Fault tolerance; Fault tolerant systems; Probability; Reliability engineering; Systems engineering and theory; Time factors; Correlated failures; Fault-tolerant computing; Reliability model
  • fLanguage
    English
  • Journal_Title
    Reliability, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9529
  • Type

    jour

  • DOI
    10.1109/TR.1987.5222334
  • Filename
    5222334