• DocumentCode
    3447900
  • Title

    Pair and swap: An approach to graceful degradation for dependable chip multiprocessors

  • Author

    Imai, Masashi ; Nagai, Tomohide ; Nanya, Takashi

  • Author_Institution
    Univ. of Tokyo, Tokyo, Japan
  • fYear
    2010
  • fDate
    June 28 2010-July 1 2010
  • Firstpage
    119
  • Lastpage
    124
  • Abstract
    In this paper, we propose a processor-level fault tolerance technique called “Pair and Swap (P&S)” for multi-core chips. In a P&S system, the 2n processor cores of a chip multiprocessor (CMP) are organized into n pairs. Two identical copies of a given task are executed on the two cores of each pair, and the results are compared repeatedly. If a fault is detected by a mismatch, the partners of the mismatched pair are swapped with those of another pair and the mismatched task is re-executed from the latest checkpoint. It is then decided whether the fault is transient or permanent. If it is permanent, the faulty core is identified and isolated, and the entire system is reconfigured. P&S enables graceful degradation and tolerates both permanent and transient faults. We evaluate the performance of the proposed P&S scheme and of traditional triple module redundancy (TMR) using Markov chains. The mean computation to failure of P&S is about 1.4 times larger than that of a dynamic TMR scheme.
  • Keywords
    fault tolerant computing; microprocessor chips; multiprocessing systems; CMP; Markov chains; TMR; dependable chip multiprocessors; fault tolerance technique; faulty core; graceful degradation; triple module redundancy; Clocks; Degradation; Fabrication; Fault detection; Fault diagnosis; Fault tolerant systems; Multicore processing; Power dissipation; Redundancy; Very large scale integration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Dependable Systems and Networks Workshops (DSN-W), 2010 International Conference on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    978-1-4244-7729-6
  • Electronic_ISBN
    978-1-4244-7728-9
  • Type

    conf

  • DOI
    10.1109/DSNW.2010.5542608
  • Filename
    5542608
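
The detect, swap, and diagnose steps described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Core` fault model, the `classify` helper, the stand-in task (doubling an integer), and the diagnosis rule, which assumes at most one faulty core among the four and a fault-free partner pair.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical core model; not the paper's fault model."""
    cid: int
    permanently_faulty: bool = False
    transient_pending: int = 0  # number of upcoming runs that get corrupted

    def run(self, task_input: int) -> int:
        correct = task_input * 2  # stand-in for the real task
        if self.permanently_faulty:
            return correct + 1  # always wrong
        if self.transient_pending > 0:
            self.transient_pending -= 1
            return correct + 1  # wrong only this time
        return correct


def classify(a: Core, b: Core, c: Core, d: Core, task_input: int):
    """Run the task on pair (a, b); on a mismatch, swap partners with pair
    (c, d) and re-execute to classify the fault.

    Returns (kind, core_id), where kind is 'none', 'transient',
    'permanent', or 'undetermined'.
    """
    if a.run(task_input) == b.run(task_input):
        return ("none", None)

    # Mismatch: swap partners to form (a, c) and (b, d), then re-execute
    # the task (standing in for a restart from the latest checkpoint).
    ac_ok = a.run(task_input) == c.run(task_input)
    bd_ok = b.run(task_input) == d.run(task_input)

    if ac_ok and bd_ok:
        return ("transient", None)  # the fault did not recur
    if not ac_ok and bd_ok:
        return ("permanent", a.cid)  # the fault followed core a
    if ac_ok and not bd_ok:
        return ("permanent", b.cid)  # the fault followed core b
    return ("undetermined", None)  # violates the single-fault assumption


if __name__ == "__main__":
    cores = [Core(0), Core(1, permanently_faulty=True), Core(2), Core(3)]
    print(classify(*cores, task_input=21))
```

In this sketch a core reported as permanently faulty would then be removed from the pool and the remaining cores re-paired, which is where graceful degradation would come in; that reconfiguration step is left out here.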