DocumentCode
2254472
Title
Intermittent Hardware Errors Recovery: Modeling and Evaluation
Author
Rashid, Layali ; Pattabiraman, Karthik ; Gopalakrishnan, Sathish
Author_Institution
Univ. of British Columbia, Vancouver, BC, Canada
fYear
2012
fDate
17-20 Sept. 2012
Firstpage
220
Lastpage
229
Abstract
The frequency of hardware errors is increasing due to shrinking feature sizes, higher levels of integration, and increasing design complexity. Intermittent errors are those that occur non-deterministically at the same location. It has been shown that intermittent hardware errors contribute to about 39% of total hardware failures. Intermittent faults have characteristics that differ from those of transient and permanent errors, which makes it challenging to devise efficient recovery techniques for them. In this paper, we evaluate the impact of different intermittent error recovery scenarios on processor performance. To achieve this, we model a system that consists of a fault-tolerant multicore processor subject to intermittent faults. Our fault models are based on insights from related work at the physical level. We find that the frequency of the intermittent error and the relative importance of the error location play an important role in choosing the recovery action that maximizes the processor's performance.
Keywords
computational complexity; fault tolerant computing; multiprocessing systems; performance evaluation; system recovery; design complexity; fault-tolerant multicore processor; hardware errors recovery; hardware failures; intermittent errors; intermittent faults; Checkpointing; Circuit faults; Hardware; Logic gates; Microarchitecture; Transient analysis; Transistors; Intermittent hardware faults; fault model; recovery; stochastic activity network; transistor wearout;
fLanguage
English
Publisher
IEEE
Conference_Titel
Quantitative Evaluation of Systems (QEST), 2012 Ninth International Conference on
Conference_Location
London
Print_ISBN
978-1-4673-2346-8
Electronic_ISBN
978-0-7695-4781-7
Type
conf
DOI
10.1109/QEST.2012.37
Filename
6354654