Title :
Maximal global snapshot with concurrent initiators
Author :
Prakash, Ravi ; Singhal, Mukesh
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Abstract :
In a distributed system, multiple nodes may initiate snapshot collection concurrently. In this paper, we present a global snapshot collection algorithm that combines the information collected by each initiator. This generates a maximal, consistent global snapshot that is more recent than the snapshot collected by any initiator. Global snapshots are used to establish checkpoints for recovery from node failures. A maximal snapshot implies that the amount of computation lost during rollback after node failures is minimized. We also present an efficient information dissemination strategy that nodes can employ to exchange snapshot information with each other.
Keywords :
concurrency control; distributed processing; system monitoring; system recovery; concurrent initiators; concurrent snapshot; consistent snapshot; distributed system; global snapshot; global snapshot collection; node failures; recovery; Bidirectional control; Clocks; Concurrent computing; Distributed computing; Information science; Joining processes; Resumes; Synchronization
Conference_Title :
Parallel and Distributed Processing, 1994. Proceedings. Sixth IEEE Symposium on
Conference_Location :
Dallas, TX
Print_ISBN :
0-8186-6427-4
DOI :
10.1109/SPDP.1994.346149