DocumentCode
2141624
Title
REE: Exploiting idempotent property of applications for fault detection and recovery
Author
Jianli Li ; Qingping Tan ; Lanfang Tan ; Tongchuan Xin
Author_Institution
Sch. of Comput., Nat. Univ. of Defense Technol., Changsha, China
fYear
2013
fDate
23-25 July 2013
Firstpage
1623
Lastpage
1627
Abstract
As semiconductor technologies scale down to deep sub-micron dimensions, transient faults will soon become a critical reliability concern. This paper presents the Reliability Enhancement Exploiting (REE) technique, a software-implemented fault tolerance solution that exploits the idempotent property of applications. An idempotent region of code is one that can be re-executed multiple times and still produce the same, correct result. By inserting extra instructions into an idempotent region so that the region is re-executed, REE can detect transient faults that occur while the region executes. Once a fault is detected, REE recovers from it by executing the idempotent region again. To the best of our knowledge, this is the first work to exploit the idempotent property for fault detection. With fault coverage similar to that of a classic solution, REE reduces memory overhead and performance overhead by 71.8% and 31.3%, respectively.
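The detect-by-re-execution and recover-by-re-execution idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes extra instructions inserted into the program itself, whereas the code below works at the level of a Python callable. The names `run_idempotent_region` and `make_flaky_region`, the `max_attempts` parameter, and the choice to detect a fault by comparing the results of two executions are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")


def run_idempotent_region(region: Callable[[], T], max_attempts: int = 3) -> Tuple[T, int]:
    """Execute an idempotent region twice and compare the two results.

    A mismatch is treated as a detected transient fault (assumed detection
    rule); recovery is simply another pair of executions, which is safe
    only because the region is idempotent.
    Returns (result, number_of_detected_mismatches).
    """
    detected = 0
    for _ in range(max_attempts):
        first = region()
        second = region()  # re-execution of the same region
        if first == second:
            return first, detected
        detected += 1  # results disagree: a fault hit one of the runs
    raise RuntimeError("results never agreed; the fault may not be transient")


def make_flaky_region(inputs: List[int]) -> Callable[[], int]:
    """Hypothetical region that suffers one simulated bit flip on its first run."""
    calls = {"n": 0}

    def region() -> int:
        calls["n"] += 1
        total = sum(x * x for x in inputs)  # reads inputs, never overwrites them
        if calls["n"] == 1:
            total ^= 1 << 3  # simulated transient fault: flip bit 3
        return total

    return region


if __name__ == "__main__":
    result, detected = run_idempotent_region(make_flaky_region([1, 2, 3]))
    print(result, detected)
```

The region in this sketch only reads its inputs and never overwrites them, which is what makes repeated execution safe. A region that modified its own inputs would not be idempotent and could not be re-executed this way.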
Keywords
fault diagnosis; fault tolerance; integrated circuit reliability; semiconductor technology; REE; fault coverage; fault detection; fault recovery; idempotent region; memory overhead; performance overhead; reliability enhancement exploiting technique; Circuit faults; Fault tolerant systems; Hardware; Program processors; Transient analysis; Idempotent property; Transient faults
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Computation (ICNC), 2013 Ninth International Conference on
Conference_Location
Shenyang
Type
conf
DOI
10.1109/ICNC.2013.6818241
Filename
6818241