DocumentCode
1354469
Title
Correlated Failures in Fault-Tolerant Computers
Author
Hecht, Herbert ; Dussault, Heather
Author_Institution
SoHaR Incorporated, 1040 S. La Jolla Ave., Los Angeles, California 90035 USA.
Issue
2
fYear
1987
fDate
6/1/1987 12:00:00 AM
Firstpage
171
Lastpage
175
Abstract
In two repairable ground-based fault-tolerant computer systems, in which constraints on switchover time permitted manual switching as a back-up, correlated failures were an important cause of system outage. In one of the systems, a distinction could be made between outages that occurred when one computer was undergoing scheduled maintenance and outages that occurred while one computer was being repaired. The failure rate of the active computer was at least four times higher in the latter case. Several possible causes are described but could not be confirmed from the available data. In some situations, correlated failures call for a reliability model different from the commonly described models for imperfect coverage.
Keywords
Computer errors; Degradation; Design engineering; Dictionaries; Fault tolerance; Fault tolerant systems; Probability; Reliability engineering; Systems engineering and theory; Time factors; Correlated failures; Fault-tolerant computing; Reliability model
fLanguage
English
Journal_Title
Reliability, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9529
Type
jour
DOI
10.1109/TR.1987.5222334
Filename
5222334