DocumentCode
2859770
Title
Adaptive checkpointing in dynamic grids for uncertain job durations
Author
Chtepen, Maria ; Dhoedt, Bart ; De Turck, Filip ; Demeester, Piet ; Claeys, Filip H A ; Vanrolleghem, Peter A.
Author_Institution
INTEC-IBBT, Ghent Univ., Ghent, Belgium
fYear
2009
fDate
22-25 June 2009
Firstpage
585
Lastpage
590
Abstract
Adaptive checkpointing is a relatively new approach that is particularly suitable for providing fault tolerance in dynamic and unstable grid environments. The approach allows checkpointing intervals to be modified periodically at run-time as additional information becomes available. This paper introduces an adaptive algorithm, named MeanFailureCP+, for checkpointing grid applications whose execution times are unknown a priori. The algorithm modifies its parameters based on dynamically collected feedback on its own performance. Simulation results show that the new algorithm performs even better than adaptive approaches that make use of exact information on job execution times.
Keywords
grid computing; software fault tolerance; MeanFailureCP+; adaptive algorithm; adaptive checkpointing; fault-tolerance; uncertain job duration; Checkpointing; Computational modeling; Computer networks; Feedback; Job design; Resource management; Runtime
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Technology Interfaces, 2009. ITI '09. Proceedings of the ITI 2009 31st International Conference on
Conference_Location
Dubrovnik
ISSN
1330-1012
Print_ISBN
978-953-7138-15-8
Type
conf
DOI
10.1109/ITI.2009.5196152
Filename
5196152