DocumentCode :
383344
Title :
Distributed checkpointing using synchronized clocks
Author :
Neogy, S. ; Sinha, A. ; Das, P.K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India
fYear :
2002
fDate :
2002
Firstpage :
199
Lastpage :
204
Abstract :
The processes of the distributed system considered in this paper use loosely synchronized clocks. The paper describes a method by which such processes take checkpoints in a truly distributed manner, that is, without a global checkpoint coordinator. The constituent processes take checkpoints according to their own clocks at predetermined checkpoint instants. Because these checkpoints are asynchronous, determining a globally consistent set of them requires some synchronization among the processes. This is achieved by adding suitable information to the existing clock synchronization messages; using this information, the processes synchronize their checkpoints to form a globally consistent checkpoint. Communication in this system is synchronous, so processes may be blocked for communication at the checkpointing instants. A blocked process saves the state it was in just before being blocked. The paper shows that the set of such i-th checkpoints is consistent, and hence that the rollback required after a failure extends only to the last saved state.
Keywords :
synchronisation; system recovery; clock synchronization messages; distributed checkpointing; distributed system; global checkpoint coordinator; loosely synchronized clocks; rollback; synchronized clocks; Checkpointing; Clocks; Computer science; Electrical capacitance tomography; Fault tolerant systems; Message passing; Synchronization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Software and Applications Conference, 2002. COMPSAC 2002. Proceedings. 26th Annual International
ISSN :
0730-3157
Print_ISBN :
0-7695-1727-7
Type :
conf
DOI :
10.1109/CMPSAC.2002.1044553
Filename :
1044553