DocumentCode
2747969
Title
The performance of coordinated and independent checkpointing
Author
Silva, Luis Moura ; Silva, João Gabriel
Author_Institution
Dept. de Engenharia Inf., Coimbra Univ., Portugal
fYear
1999
fDate
12-16 Apr 1999
Firstpage
280
Lastpage
284
Abstract
Checkpointing is an effective technique for tolerating failures in distributed and parallel applications. Existing algorithms in the literature fall into two main classes: coordinated and independent checkpointing. This paper presents an experimental study that compares the performance of these two classes of algorithms. The main conclusion of our study is that coordinated checkpointing is more efficient than independent checkpointing, and that none of the arguments against the performance of coordinated algorithms were confirmed in practice.
Keywords
fault tolerant computing; performance evaluation; system recovery; checkpointing; coordinated checkpointing; distributed; fault tolerance; independent checkpointing; parallel; Bandwidth; Electrical capacitance tomography; Parallel machines; Protocols; Runtime library; Testing
fLanguage
English
Publisher
IEEE
Conference_Titel
Proceedings of the 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999)
Conference_Location
San Juan
Print_ISBN
0-7695-0143-5
Type
conf
DOI
10.1109/IPPS.1999.760487
Filename
760487