Title :
CFEDR: Control-flow error detection and recovery using encoded signatures monitoring
Author :
Lanfang Tan ; Ying Tan ; Jianjun Xu
Author_Institution :
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
Abstract :
The incorporation of error detection and recovery mechanisms becomes mandatory as the probability of transient faults increases. The detection of control-flow errors has been extensively investigated in the literature. However, only a few works have addressed recovery from control-flow errors. Generally, a program is re-executed after error detection. Although re-execution prevents faults from corrupting data, it does not allow the application to run to completion correctly in the presence of an error. Moreover, re-execution adds considerable overhead. The current study presents a pure-software method based on encoded signatures to recover from control-flow errors. Unlike general signature monitoring techniques, the proposed method targets not only interblock transitions but also intrablock and inter-function transitions. After an illegal transition is detected, the program flow transfers back to the block where the error occurred, and the data errors caused by error propagation are recovered. Fault injection and performance overhead experiments are performed to evaluate the proposed method. The experimental results show that most control-flow errors can be recovered with relatively low performance overhead.
Keywords :
digital signatures; fault diagnosis; program control structures; software fault tolerance; system recovery; CFEDR; control-flow error detection and recovery; data errors; encoded signatures monitoring; error propagation; fault injection; general signature monitoring techniques; illegal transition detection; inter-function transitions; interblock transitions; intrablock transitions; probability; program flow; software method; transient faults occurrence; Computers; Educational institutions; Fault tolerance; Fault tolerant systems; Monitoring; Registers; Transient analysis
Conference_Title :
Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), 2013 IEEE International Symposium on
Conference_Location :
New York City, NY
Print_ISBN :
978-1-4799-1583-5
DOI :
10.1109/DFT.2013.6653578