Title :
Communication-induced determination of consistent snapshots
Author :
Hélary, Jean-Michel ; Mostefaoui, Achour ; Raynal, Michel
Author_Institution :
IRISA, Rennes, France
fDate :
9/1/1999 12:00:00 AM
Abstract :
A classical way to determine consistent snapshots is to use Chandy-Lamport's algorithm. This algorithm relies on specific control messages that allow processes to synchronize local checkpoint determination and message recording so that the resulting snapshot is consistent. This paper investigates a communication-induced approach to determining consistent snapshots. In such an approach, control information is carried by application messages. Two abstract necessary and sufficient conditions are stated: one associated with global checkpoint consistency, the other with message recording. A general protocol is derived from these abstract conditions. This general protocol can be instantiated in distinct ways, giving rise to a family of communication-induced snapshot protocols. It also shows that there is an intrinsic trade-off between the number of forced checkpoints and the number of recorded messages. Finally, a particular instantiation of the general protocol is provided.
Keywords :
message passing; protocols; synchronisation; system recovery; application messages; communication-induced snapshot protocols; consistent snapshots; control messages; global checkpoint consistency; local checkpoint determination; message recording; synchronization; Checkpointing; Communication system control; Distributed computing; Helium; Protocols; Sufficient conditions
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on