Title :
On the probability of detecting data errors generated by permanent faults using time redundancy
Author :
Aidemark, Joakim ; Folkesson, Peter ; Karlsson, Johan
Author_Institution :
Comput. Eng. Dept., Chalmers Univ. of Technol., Goteborg, Sweden
Abstract :
Time-redundant execution of tasks and comparison of results is a well-known technique for detecting transient faults in computer systems. However, time redundancy is also capable of detecting permanent faults that occur during or between the executions of two task replicas, provided the faults affect the results of the two tasks in different ways. In this paper, we derive an expression for estimating the probability of detecting data errors generated by permanent faults with time-redundant execution. The expression is validated experimentally by injecting permanent stuck-at faults into a multiplier unit of a microprocessor. We use the derived expression to show how tasks can be scheduled to improve the detection probability of errors generated by permanent faults. We also show that the capability to detect permanent faults is low for the Temporal Error Masking (TEM) technique (i.e., triplicated execution and voting to mask transient faults) and may not be increased by scheduling. Thus, we propose complementing TEM with special test tasks.
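The detection condition stated in the abstract (a permanent fault is caught only if it affects the two replicas differently) can be illustrated with a toy model. The sketch below is not the paper's expression or experimental setup: the `multiply` function, the single stuck-at output bit, and the 32-bit width are assumptions made purely for illustration.

```python
def multiply(a, b, stuck_bit=None, stuck_value=1, width=32):
    """Hypothetical multiplier model (an assumption, not the paper's unit).

    If stuck_bit is given, that bit of the product is forced to stuck_value,
    mimicking a permanent stuck-at fault on one output line.
    """
    product = (a * b) & ((1 << width) - 1)
    if stuck_bit is None:
        return product
    mask = 1 << stuck_bit
    return (product | mask) if stuck_value else (product & ~mask)


def detected_by_comparison(a, b, fault_first, fault_second):
    """Run the same task twice and compare results (time redundancy).

    fault_first / fault_second name the stuck bit active during each
    execution, or None if the unit is fault-free at that time.
    """
    r1 = multiply(a, b, stuck_bit=fault_first)
    r2 = multiply(a, b, stuck_bit=fault_second)
    return r1 != r2


# Fault appears between the two executions and corrupts the second result:
# 3 * 4 = 12 (0b1100), bit 0 stuck at 1 gives 13, so the comparison differs.
fault_between = detected_by_comparison(3, 4, None, 0)

# Fault is present during both executions: both replicas return the same
# wrong value (13), so comparison alone cannot reveal the error.
fault_in_both = detected_by_comparison(3, 4, 0, 0)

# Fault appears between executions but does not change the result:
# 3 * 5 = 15 already has bit 0 set, so no data error is generated.
fault_inactive = detected_by_comparison(3, 5, None, 0)
```

In this toy setting only the first case is flagged. The second case shows why identical replicas executed on identical data are weak against a fault that is already present, which is consistent with the abstract's suggestion that scheduling and dedicated test tasks matter.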
Keywords :
embedded systems; error detection; fault tolerant computing; transient analysis; Thor microprocessor; computer systems; data error; detection capability; detection probability; error detection mechanism; mask transient faults; multiplier unit; permanent faults; permanent stuck-at faults; task replicas; temporal error masking; time redundancy; time redundant execution; triplicated execution; triplicated voting; Arithmetic; Computer errors; Error correction; Fault detection; Kernel; Microprocessors; Processor scheduling; Redundancy; Testing; Voting
Conference_Title :
On-Line Testing Symposium, 2003. IOLTS 2003. 9th IEEE
Print_ISBN :
0-7695-1968-7
DOI :
10.1109/OLT.2003.1214369