DocumentCode
383344
Title
Distributed checkpointing using synchronized clocks
Author
Neogy, S. ; Sinha, A. ; Das, P.K.
Author_Institution
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
fYear
2002
fDate
2002
Firstpage
199
Lastpage
204
Abstract
The processes of the distributed system considered in this paper use loosely synchronized clocks. The paper describes a method by which such processes take checkpoints in a truly distributed manner, that is, without a global checkpoint coordinator. The constituent processes take checkpoints according to their own clocks at predetermined checkpoint instants. Because these checkpoints are asynchronous, some form of synchronization among them is needed to determine a globally consistent set. This is achieved by adding suitable information to the existing clock synchronization messages, which the processes use to synchronize their checkpoints into a globally consistent checkpoint. Communication in this system is synchronous, so processes may be blocked on communication at the checkpointing instants. A blocked process saves the state it was in just before being blocked. It is shown that the set of such i-th checkpoints is consistent, and hence that the rollback required when a failure occurs extends only to the last saved state.
Keywords
synchronisation; system recovery; clock synchronization messages; distributed checkpointing; distributed system; global checkpoint coordinator; loosely synchronized clocks; rollback; synchronized clocks; Checkpointing; Clocks; Computer science; Electrical capacitance tomography; Fault tolerant systems; Message passing; Synchronization
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Software and Applications Conference, 2002. COMPSAC 2002. Proceedings. 26th Annual International
ISSN
0730-3157
Print_ISBN
0-7695-1727-7
Type
conf
DOI
10.1109/CMPSAC.2002.1044553
Filename
1044553