Title :
Adaptive Failover for Real-Time Middleware with Passive Replication
Author :
Balasubramanian, Jaiganesh ; Tambe, Sumant ; Lu, Chenyang ; Gokhale, Aniruddha ; Gill, Christopher ; Schmidt, Douglas C.
Author_Institution :
Dept. of EECS, Vanderbilt Univ., Nashville, TN
Abstract :
Supporting uninterrupted services for distributed soft real-time applications is hard in resource-constrained and dynamic environments, where processor or process failures and system workload changes are common. Fault-tolerant middleware for these applications must achieve high service availability and satisfactory response times for client applications. Although passive replication is a promising fault tolerance strategy for resource-constrained systems, conventional client failover approaches are non-adaptive and load-agnostic, which can cause system overloads and significantly increase response times after failure recovery. This paper presents four contributions to the study of passive replication for distributed soft real-time applications. First, it describes how our fault-tolerant, load-aware, and adaptive middleware (FLARe) dynamically adjusts failover targets at runtime in response to system load fluctuations and resource availability. Second, it describes how FLARe's overload management strategy proactively enforces desired CPU utilization bounds by redirecting clients from overloaded processors. Third, it presents the design and implementation of FLARe's lightweight middleware architecture that manages failures and overloads transparently to clients. Finally, it presents experimental results on a distributed Linux testbed that demonstrate how FLARe adaptively maintains soft real-time performance for clients operating in the presence of failures and overloads with negligible runtime overhead.
Keywords :
Linux; middleware; software fault tolerance; software performance evaluation; adaptive failover; adaptive middleware; distributed Linux testbed; fault-tolerant middleware; overloaded processors; passive replication; real-time middleware; resource-constrained systems; Availability; Delay; Fault tolerance; Fault tolerant systems; Fluctuations; Real time systems; Runtime; Testing; adaptive; load-aware
Conference_Title :
Real-Time and Embedded Technology and Applications Symposium, 2009. RTAS 2009. 15th IEEE
Conference_Location :
San Francisco, CA
Print_ISBN :
978-0-7695-3636-1
DOI :
10.1109/RTAS.2009.36