DocumentCode
2257155
Title
A communication-induced checkpointing algorithm using virtual checkpoint on distributed systems
Author
Do-Hyung Kim; Chang-Soon Park
Author_Institution
Electron. & Telecommun. Res. Inst., South Korea
fYear
2000
fDate
2000
Firstpage
145
Lastpage
150
Abstract
Checkpointing is a fault-tolerance technique for recovering from faults and restarting jobs quickly. Checkpointing algorithms for distributed systems have been studied for years and can be classified into three types: coordinated, uncoordinated, and communication-induced. In this paper we propose a new communication-induced checkpointing algorithm whose checkpoint count is minimal, equal to that of the periodic checkpointing algorithm, and whose rollback distance when a fault occurs is relatively short. The proposed algorithm is compared with previously proposed communication-induced checkpointing algorithms through simulation. In the simulation, the proposed algorithm outperforms the other algorithms in task completion time in both fault-free and faulty situations.
Keywords
distributed processing; fault tolerant computing; system recovery; virtual machines; communication-induced checkpointing algorithm; distributed systems; rollback distance; simulation; task completion time; virtual checkpoint; Checkpointing; Communication system control; Degradation; Fault tolerant systems; Force control; Hardware; Terminology;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Systems, 2000. Proceedings. Seventh International Conference on
Conference_Location
Iwate
ISSN
1521-9097
Print_ISBN
0-7695-0568-6
Type
conf
DOI
10.1109/ICPADS.2000.857693
Filename
857693