Title :
Strategies for Checkpoint Storage on Opportunistic Grids
Author :
De Camargo, Raphael Y. ; Kon, Fabio ; Cerqueira, Renato
Author_Institution :
Sao Paulo Univ.
Abstract :
This article evaluates several strategies for storing checkpoint data in an opportunistic grid environment, including replication, parity information, and erasure coding. We present a prototype implementation of a distributed checkpoint repository over InteGrade, a multiuniversity grid middleware project to leverage the computing power of idle shared workstations. Using this prototype, we performed several experiments to determine the trade-offs in these strategies between computational overhead, storage overhead, and degree of fault tolerance.
Keywords :
checkpointing; fault tolerant computing; grid computing; middleware; storage management; data replication; distributed checkpoint data storage; erasure coding; fault tolerance; idle shared workstation; opportunistic grid environment; parity information; Art; Computational efficiency; Concurrent computing; Hardware; Information retrieval; Memory; Network servers; Prototypes; data coding; distributed storage
Journal_Title :
Distributed Systems Online, IEEE
DOI :
10.1109/MDSO.2006.56