DocumentCode :
2787568
Title :
SymptomTM: Symptom-Based Error Detection and Recovery Using Hardware Transactional Memory
Author :
Yalcin, Gulay ; Unsal, Osman S. ; Cristal, Adrian ; Hur, Ibrahim ; Valero, Mateo
Author_Institution :
Artificial Intell. Res. Inst., Spanish Nat. Res. Council, Spain
fYear :
2011
fDate :
10-14 Oct. 2011
Firstpage :
199
Lastpage :
200
Abstract :
Fault tolerance has become an essential concern for processor designers due to increasing transient and permanent fault rates. In this study, we propose SymptomTM, a symptom-based error detection technique that recovers from errors by leveraging the abort mechanism of Transactional Memory (TM). To the best of our knowledge, this is the first architectural fault-tolerance proposal using Hardware Transactional Memory (HTM). SymptomTM can recover from 86% and 65% of catastrophic failures caused by transient and permanent errors, respectively, with no performance overhead in error-free executions.
Keywords :
fault tolerant computing; system recovery; transaction processing; SymptomTM; architectural fault tolerance; hardware transactional memory; permanent fault rate; processor designer; symptom-based error detection; symptom-based error recovery; transient fault rate; Fault tolerance; Fault tolerant systems; Hardware; Monitoring; Proposals; Transient analysis
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Architectures and Compilation Techniques (PACT), 2011 International Conference on
Conference_Location :
Galveston, TX
ISSN :
1089-795X
Print_ISBN :
978-1-4577-1794-9
Type :
conf
DOI :
10.1109/PACT.2011.39
Filename :
6113814