• DocumentCode
    2254472
  • Title

    Intermittent Hardware Errors Recovery: Modeling and Evaluation

  • Author

    Rashid, Layali ; Pattabiraman, Karthik ; Gopalakrishnan, Sathish

  • Author_Institution
    Univ. of British Columbia, Vancouver, BC, Canada
  • fYear
    2012
  • fDate
    17-20 Sept. 2012
  • Firstpage
    220
  • Lastpage
    229
  • Abstract
    The frequency of hardware errors is increasing due to shrinking feature sizes, higher levels of integration, and increasing design complexity. Intermittent errors are those that occur non-deterministically at the same location. It has been shown that intermittent hardware errors contribute to about 39% of the total hardware failures. Intermittent faults have characteristics that differ from those of transient and permanent errors, which makes it challenging to devise efficient recovery techniques for them. In this paper, we evaluate the impact of different intermittent error recovery scenarios on processor performance. To achieve this, we model a system that consists of a fault-tolerant multicore processor subject to intermittent faults. Our fault models are based on insights from related work at the physical level. We find that the frequency of the intermittent error and the relative importance of the error location play an important role in choosing the recovery action that maximizes the processor's performance.
  • Keywords
    computational complexity; fault tolerant computing; multiprocessing systems; performance evaluation; system recovery; design complexity; fault-tolerant multicore processor; hardware errors recovery; hardware failures; intermittent errors; intermittent faults; Checkpointing; Circuit faults; Hardware; Logic gates; Microarchitecture; Transient analysis; Transistors; Intermittent hardware faults; fault model; recovery; stochastic activity network; transistor wearout;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Quantitative Evaluation of Systems (QEST), 2012 Ninth International Conference on
  • Conference_Location
    London
  • Print_ISBN
    978-1-4673-2346-8
  • Electronic_ISBN
    978-0-7695-4781-7
  • Type

    conf

  • DOI
    10.1109/QEST.2012.37
  • Filename
    6354654