Title :
Evaluation and design of an ultra-reliable distributed architecture for fault tolerance
Author :
Walter, Chris J.
Author_Institution :
Allied-Signal Aerospace Technology Center, Columbia, MD, USA
Date :
10/1/1990
Abstract :
Issues in the experimental evaluation of an early conceptual prototype of MAFT (multicomputer architecture for fault tolerance) are discussed. A completely automated testing approach was designed to allow fault-injection experiments, including stuck-at and memory faults, to be performed. Over 2000 injection tests were run, and the system successfully tolerated all faults. Concurrent with the experimental evaluation, an analytic evaluation was carried out to determine whether higher levels of reliability could be achieved. The lessons learned in the evaluation phase culminated in a new design of the MAFT architecture for applications needing ultrareliability. The design uses the concept of redundantly self-checking functions to address the rigid requirements proposed for a future generation of mission-critical avionics. The testing of three subsystems critical to the operation of the new MAFT is presented; close to 50,000 test cycles were performed across 51 different IC devices to verify the designs.
Keywords :
automatic testing; computer testing; distributed processing; fault tolerant computing; IC devices; MAFT; automated testing; fault tolerance; fault-injection experiments; memory faults; mission-critical avionics; redundantly self-checking functions; stuck-at faults; ultra-reliable distributed architecture; Communication system control; Fault tolerant systems; Integrated circuit testing; Performance evaluation; Probability; Processor scheduling; Prototypes; Redundancy; System testing
Journal_Title :
Reliability, IEEE Transactions on