Title :
Towards Autonomic Virtual Applications in the In-VIGO System
Author :
Xu, Jing ; Adabala, Sumalatha ; Fortes, José A B
Author_Institution :
Dept. of Electr. & Comput. Eng., Florida Univ., Gainesville, FL
Abstract :
Grid environments enable users to share nondedicated resources that lack performance guarantees. This paper describes the design of application-centric middleware components that automatically recover from failures and dynamically adapt to grid environments with changing resource availability, improving fault tolerance and performance. The key components of the application-centric approach are a global per-application execution history and an autonomic component that tracks the performance of a job on a grid resource against predictions based on that history, to guide rescheduling decisions. Performance models of unmodified applications, built from their execution history, are used to predict failure as well as poor performance. A prototype of the proposed approach, an autonomic virtual application manager (AVAM), has been implemented in the context of the In-VIGO grid environment. Its effectiveness has been evaluated for applications that generate CPU-intensive jobs with relatively short execution times (ranging from tens of seconds to less than an hour) on resources with highly variable loads, a workload generated by typical educational usage scenarios of In-VIGO-like grid environments. A memory-based learning algorithm is used to build the performance models for CPU-intensive applications that are used to predict the need for rescheduling. Results show that In-VIGO jobs managed by the AVAM consistently meet their execution deadlines under varying load conditions and gracefully recover from unexpected failures.
Keywords :
automatic programming; fault tolerant computing; grid computing; learning (artificial intelligence); middleware; object-oriented programming; resource allocation; storage management; system recovery; CPU-intensive job; In-VIGO system; application execution history; application-centric middleware component; autonomic component; autonomic virtual application manager; autonomic virtual applications; dynamic adaptation; failure prediction; failure recovery; fault tolerance; grid environments; grid resource; job rescheduling; memory-based learning algorithm; nondedicated resource sharing; performance guarantee; rescheduling decision; resource availability; system performance; Application software; Availability; Fault tolerance; Grid computing; History; Information systems; Mesh generation; Middleware; Predictive models; Quality of service
Conference_Title :
Autonomic Computing, 2005. ICAC 2005. Proceedings. Second International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7965-2276-9
DOI :
10.1109/ICAC.2005.62