• DocumentCode
    1238727
  • Title
    Autonomic microprocessor execution via self-repairing arrays
  • Author
    Bower, Fred A. ; Ozev, Sule ; Sorin, Daniel J.
  • Author_Institution
    Dept. of Comput. Sci., Duke Univ., Durham, NC, USA
  • Volume
    2
  • Issue
    4
  • fYear
    2005
  • Firstpage
    297
  • Lastpage
    310
  • Abstract
    To achieve high reliability despite hard faults that occur during operation and to achieve high yield despite defects introduced at fabrication, a microprocessor must be able to tolerate hard faults. In this paper, we present a framework for autonomic self-repair of the array structures in microprocessors (e.g., reorder buffer, instruction window). The framework consists of three aspects: 1) detecting/diagnosing the fault, 2) recovering from the resultant error, and 3) mapping out the faulty portion of the array. For each aspect, we present design options. Based on this framework, we develop two particular schemes for self-repairing array structures (SRAS). Simulation results show that one of our SRAS schemes adds some performance overhead in the fault-free case, but that both of them mask hard faults 1) with less hardware overhead cost than higher-level redundancy (e.g., IBM mainframes) and 2) without the per-error performance penalty of existing low-cost techniques that combine error detection with pipeline flushes for backward error recovery (BER). When hard faults are present in arrays, whether from operational faults or fabrication defects, SRAS schemes outperform BER because they do not have to flush the pipeline frequently.
  • Keywords
    fault diagnosis; fault tolerant computing; integrated circuit reliability; logic arrays; logic design; logic testing; microprocessor chips; system recovery; autonomic microprocessor execution; error recovery; fault detection; self-repairing array structures; Bit error rate; Computer errors; Costs; Fabrication; Hardware; Logic; Microprocessors; Pipelines; Testing; Index Terms: Logic design reliability and testing; microprocessors and microcomputers
  • fLanguage
    English
  • Journal_Title
    Dependable and Secure Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5971
  • Type
    jour
  • DOI
    10.1109/TDSC.2005.44
  • Filename
    1542052