DocumentCode :
1829521
Title :
A checkpoint scheme with task duplication considering transient and permanent faults
Author :
Yang, J.-M. ; Kwak, S.W.
Author_Institution :
Dept. of Electr. Eng., Catholic Univ. of Daegu, Gyeongsan, South Korea
fYear :
2010
fDate :
7-10 Dec. 2010
Firstpage :
606
Lastpage :
610
Abstract :
Proposed here is a novel architecture for a fault-tolerant real-time system. We employ a checkpoint rollback strategy with double modular redundancy (DMR). The main focus is on recovering from both transient and permanent faults without any built-in fault-detection modules or spare processors. In addition to comparing the states of the duplicated tasks, the system has access to the state saved at the previous checkpoint, so that the integrity of a processor can be checked. Using a Markov model that captures the behavior of the proposed scheme, we calculate the probability of task completion in the presence of faults that occur according to a Poisson process. The number of checkpoints is then chosen to maximize the probability of task completion.
Keywords :
Markov processes; checkpointing; fault tolerant computing; multiprocessing systems; probability; real-time systems; Markov model; Poisson process; checkpoint rollback strategy; double modular redundancy (DMR); fault-tolerant real-time system; permanent faults; processor integrity checking; task completion probability; task duplication; transient faults; fault tolerance; fault tolerant systems; program processors; transient analysis; real-time tasks
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Industrial Engineering and Engineering Management (IEEM), 2010 IEEE International Conference on
Conference_Location :
Macao
ISSN :
2157-3611
Print_ISBN :
978-1-4244-8501-7
Electronic_ISBN :
2157-3611
Type :
conf
DOI :
10.1109/IEEM.2010.5674520
Filename :
5674520