Title :
A Unified Method for Analyzing Mission Reliability for Fault Tolerant Computer Systems
Author :
Bricker, Jacob L.
Author_Institution :
Hughes Aircraft Company, Fullerton, Calif. 92634.
fDate :
6/1/1973
Abstract :
A reliability model is proposed and evaluated for a fault-tolerant computer system that consists of multiple classes of modules and allows for degraded modes of performance. Each module of a given class has a constant active hazard rate and a constant passive (dormant) hazard rate, and the class may operate either in N-modular redundancy (NMR: n + 1 out of 2n + 1 = N) or as a standby sparing system. The model allows for mission-phase changes at deterministic time points, at which the number of modules per class can be changed. The analysis proceeds by generalizing the notions of standby and NMR redundancy (for N = 3, NMR is triple modular redundancy, TMR) into a concept called hybrid-degraded redundancy. The probabilistic evaluation of this unified redundancy concept is then developed to yield, for a given modular class, the joint distribution of success and the number of nonfailed modules from that class, at special times. With this information, a Markov chain analysis gives the reliability of an entire sequence of phases (the mission profile).
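As a rough illustration of the plain NMR case named in the abstract (not the paper's hybrid-degraded model), the sketch below computes the probability that at least n + 1 of N = 2n + 1 modules survive to time t. It assumes independent, identical modules with a single constant active hazard rate, a perfect voter, no spares, and no dormant failures; the function and parameter names are hypothetical and do not come from the paper.

```python
from math import comb, exp


def module_reliability(hazard_rate: float, t: float) -> float:
    """Survival probability of one module under a constant hazard rate."""
    return exp(-hazard_rate * t)


def nmr_reliability(n: int, hazard_rate: float, t: float) -> float:
    """Probability that at least n + 1 of N = 2n + 1 modules survive to time t.

    Assumptions (illustrative only): independent identical modules,
    a perfect voter, no standby spares, no dormant failures.
    """
    total = 2 * n + 1  # N in the abstract's notation
    r = module_reliability(hazard_rate, t)
    # Binomial tail: sum over every count k of surviving modules that still forms a majority.
    return sum(
        comb(total, k) * r**k * (1.0 - r) ** (total - k)
        for k in range(n + 1, total + 1)
    )
```

For n = 1 the sum reduces to the familiar TMR expression 3r^2 - 2r^3, where r is the single-module reliability.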
Keywords :
Aircraft; Distributed computing; Fault tolerant systems; Hazards; Jacobian matrices; Logic; NASA; Nuclear magnetic resonance; Redundancy; Space stations;
Journal_Title :
Reliability, IEEE Transactions on
DOI :
10.1109/TR.1973.5216037