DocumentCode
1728605
Title
Job Migration and Fault Tolerance in SLA-Aware Resource Management Systems
Author
Battre, D. ; Hovestadt, Matthias ; Kao, Odej ; Keller, Axel ; Voss, Kerstin
Author_Institution
Tech. Univ. Berlin, Berlin
fYear
2008
Firstpage
43
Lastpage
48
Abstract
Contractually fixed service quality levels are mandatory prerequisites for attracting the commercial user to Grid environments. Service level agreements (SLAs) are powerful instruments for describing obligations and expectations in such a business relationship. At the level of local resource management systems, checkpointing and restart is an important instrument for realizing fault tolerance and SLA-awareness. This paper highlights the concepts of migrating such checkpoint datasets to achieve the goal of SLA-compliant job execution.
Keywords
grid computing; software fault tolerance; fault tolerance; job migration; resource management systems; service level agreements; Business; Checkpointing; Fault tolerance; Fault tolerant systems; Grid computing; Instruments; Middleware; Quality of service; Resource management; Risk management; Checkpointing; Fault Tolerance; Grid; Migration; RMS; Resource Management System; SLA; Service Level Agreement;
fLanguage
English
Publisher
IEEE
Conference_Titel
Grid and Pervasive Computing Workshops, 2008. GPC Workshops '08. The 3rd International Conference on
Conference_Location
Kunming
Print_ISBN
978-0-7695-3177-9
Type
conf
DOI
10.1109/GPC.WORKSHOPS.2008.71
Filename
4539323