DocumentCode
1957333
Title
Runtime Fault-Handling for Job-Flow Management in Grid Environments
Author
Dasgupta, Gargi ; Ezenwoye, Onyeka ; Fong, Liana ; Kalayci, Selim ; Sadjadi, S. Masoud ; Viswanathan, Balaji
Author_Institution
IBM India Res. Lab., New Delhi
fYear
2008
fDate
2-6 June 2008
Firstpage
201
Lastpage
202
Abstract
The execution of job flow applications is a reality today in academic and industrial domains. In this paper, we propose an approach to adding self-healing behavior to the execution of job flows without the need to modify the job flow engines or redevelop the job flows themselves. We show the feasibility of our non-intrusive approach to self-healing by inserting a generic proxy into an existing two-level job-flow management system, which employs job-flow-based service orchestration at the upper level and service choreography at the lower level. The generic proxy is inserted transparently between these two layers so that it can intercept all of their interactions. We developed a prototype of our approach in a real Grid environment to show how the proxy facilitates runtime fault handling for failure recovery.
Keywords
fault tolerant computing; grid computing; failure recovery; generic proxy; grid environments; job flow based service orchestration; job flow engines; job flow execution; runtime fault-handling; self-healing behavior; service choreography; two-level job-flow management system; Computer architecture; Environmental management; Fault tolerance; Grid computing; Job shop scheduling; Logic; Portals; Processor scheduling; Resource management; Runtime environment; fault-tolerance; generic proxy; job-flow management; job-flows; meta-scheduler;
fLanguage
English
Publisher
IEEE
Conference_Titel
Autonomic Computing, 2008. ICAC '08. International Conference on
Conference_Location
Chicago, IL
Print_ISBN
978-0-7695-3175-5
Electronic_ISBN
978-0-7695-3175-5
Type
conf
DOI
10.1109/ICAC.2008.16
Filename
4550843