DocumentCode
287632
Title
Efficient and fault-tolerant checkpointing procedures for distributed systems
Author
Saleh, Kassem ; Agarwal, Anjali
Author_Institution
Dept. of Electr. & Comput. Eng., Kuwait Univ., Kuwait
fYear
1993
fDate
23-26 Mar 1993
Firstpage
161
Lastpage
167
Abstract
Problems related to fault tolerance in distributed systems are tackled by providing efficient and fault-tolerant algorithmic procedures for checkpointing and rollback recovery in such systems. The authors propose checkpointing algorithms which can be initiated by any process in the system or upon failure of one or more component processes as part of a backward recovery procedure. The algorithms return the most recent and consistent checkpoints, require less stable storage, and do not interfere with the progress of the distributed system application. Obtaining a consistent checkpoint is always guaranteed. Examples illustrating these algorithms are also provided.
Keywords
distributed databases; fault tolerant computing; backward recovery procedure; distributed systems; fault-tolerant algorithm procedures; fault-tolerant checkpointing procedures; rollback recovery; Checkpointing; Delay; Distributed algorithms; Distributed computing; Fault tolerant systems; Joining processes; Law; Legal factors; Resumes; System recovery;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computers and Communications, 1993., Twelfth Annual International Phoenix Conference on
Conference_Location
Tempe, AZ
Print_ISBN
0-7803-0922-7
Type
conf
DOI
10.1109/PCCC.1993.344469
Filename
344469