Title :
Design and analysis of an optimal instruction-retry policy for TMR controller computers
Author :
Kim, Hagbae ; Shin, Kang G.
Author_Institution :
Dept. of Electr. Eng., Yonsei Univ., Seoul, South Korea
fDate :
11/1/1996
Abstract :
An instruction-retry policy is proposed to enhance the fault-tolerance of triple modular redundant (TMR) controller computers by adding time redundancy to them. A TMR failure is said to occur if a TMR system fails to establish a majority among its modules' outputs due to multiple faulty modules or a faulty voter. Either multiple consecutive TMR failures, the active period of which exceeds a certain time limit, or the exhaustion of spares as a result of frequent system reconfigurations may result in failure to meet the timing constraints of one or more tasks, called dynamic failure, during a given mission. An optimal instruction-retry period is derived by minimizing the probability of dynamic failure upon detection of either a masked (by the TMR) error or a TMR failure. Using the derived optimal retry period, we also derive the minimum number of spares needed to keep the probability of dynamic failure for a given mission below a pre-specified level.
Keywords :
computerised control; fault tolerant computing; multiprocessing systems; probability; real-time systems; reconfigurable architectures; redundancy; reliability; TMR controller computers; common-cause faults; dynamic failure; external faults; fault-tolerance; hard deadlines; masked errors; multiple faulty modules; optimal instruction-retry policy; optimal retry period; reconfiguration; system reconfigurations; triple modular redundant controller computers; Computer aided instruction; Computer errors; Control systems; Fault tolerant systems; Optimal control; Process control; Real time systems; Redundancy; Strain control; Timing;
Journal_Title :
Computers, IEEE Transactions on