DocumentCode
3447900
Title
Pair and swap: An approach to graceful degradation for dependable chip multiprocessors
Author
Imai, Masashi ; Nagai, Tomohide ; Nanya, Takashi
Author_Institution
Univ. of Tokyo, Tokyo, Japan
fYear
2010
fDate
June 28 2010-July 1 2010
Firstpage
119
Lastpage
124
Abstract
In this paper, we propose a processor-level fault tolerance technique called “Pair and Swap (P&S)” for multi-core chips. In a P&S system, the 2n processor cores of a chip multiprocessor (CMP) are organized into n pairs. Two identical copies of a given task are executed on the two cores of each pair, and their results are compared repeatedly. If a mismatch reveals a fault, the partners of the mismatched pair are swapped with those of another pair and the mismatched task is re-executed from the latest checkpoint. The system then decides whether the fault is transient or permanent. If it is permanent, the faulty core is identified and isolated, and the entire system is reconfigured. P&S enables graceful degradation and tolerates both permanent and transient faults. We evaluate the performance of the proposed P&S and of traditional triple module redundancy (TMR) using Markov chains. The mean computation to failure of P&S is about 1.4 times larger than that of the dynamic TMR scheme.
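The swap-and-diagnose step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`diagnose`, `reconfigure`, `make_runner`), the single-fault assumption, and the rule that a core is judged permanently faulty when its swapped pair mismatches again on re-execution are all assumptions made for illustration only.

```python
from typing import Callable, List, Optional, Set, Tuple

Core = str
# Hypothetical hook: re-executes the task from the latest checkpoint on two
# cores and returns True when their results match.
RunPair = Callable[[Core, Core], bool]


def diagnose(mismatched: Tuple[Core, Core],
             helper: Tuple[Core, Core],
             run_pair: RunPair) -> Optional[Core]:
    """Swap partners with a helper pair, re-execute, and classify the fault.

    Returns None for a transient fault, or the core judged permanently
    faulty. Assumes at most one faulty core among the four involved.
    """
    a, b = mismatched
    c, d = helper
    # Swap partners: (a, b), (c, d) becomes (a, c), (b, d).
    ok_ac = run_pair(a, c)
    ok_bd = run_pair(b, d)
    if ok_ac and ok_bd:
        return None  # mismatch did not recur: treat as transient
    if not ok_ac and ok_bd:
        return a     # the mismatch followed core a
    if ok_ac and not ok_bd:
        return b     # the mismatch followed core b
    raise RuntimeError("both swapped pairs mismatched; "
                       "single-fault assumption violated")


def reconfigure(cores: List[Core], faulty: Core) -> List[Tuple[Core, Core]]:
    """Isolate the faulty core and re-pair the remaining ones.

    With an odd number of healthy cores, the last one is left unpaired.
    """
    healthy = [c for c in cores if c != faulty]
    return [(healthy[i], healthy[i + 1])
            for i in range(0, len(healthy) - 1, 2)]


def make_runner(permanently_faulty: Set[Core]) -> RunPair:
    """Toy fault model: a pair matches only if neither core is faulty."""
    def run_pair(x: Core, y: Core) -> bool:
        return x not in permanently_faulty and y not in permanently_faulty
    return run_pair
```

In this toy model, `diagnose(("c0", "c1"), ("c2", "c3"), make_runner({"c1"}))` singles out `c1`, after which `reconfigure` drops it and leaves one fewer pair available, which is the graceful degradation the abstract refers to.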
Keywords
fault tolerant computing; microprocessor chips; multiprocessing systems; CMP; Markov chains; TMR; dependable chip multiprocessors; fault tolerance technique; faulty core; graceful degradation; triple module redundancy; Clocks; Degradation; Fabrication; Fault detection; Fault diagnosis; Fault tolerant systems; Multicore processing; Power dissipation; Redundancy; Very large scale integration;
fLanguage
English
Publisher
IEEE
Conference_Titel
Dependable Systems and Networks Workshops (DSN-W), 2010 International Conference on
Conference_Location
Chicago, IL
Print_ISBN
978-1-4244-7729-6
Electronic_ISBN
978-1-4244-7728-9
Type
conf
DOI
10.1109/DSNW.2010.5542608
Filename
5542608