Title :
Crash fault detection in celerating environments
Author :
Sastry, Srikanth ; Pike, Scott M. ; Welch, Jennifer L.
Author_Institution :
Dept. of Comput. Sci. & Eng., Texas A&M Univ., College Station, TX, USA
Abstract :
Failure detectors are a service that provides (approximate) information about process crashes in a distributed system. The well-known "eventually perfect" failure detector, ◇P, has been implemented in partially synchronous systems with unknown upper bounds on message delay and relative process speeds. However, previous implementations have overlooked an important subtlety with respect to measuring the passage of time in "celerating" environments, in which absolute process speeds can continually increase or decrease while maintaining bounds on relative process speeds. Existing implementations either use action clocks, which fail in accelerating environments, or use real-time clocks, which fail in decelerating environments. We propose the use of bichronal clocks, which are a composition of action clocks and real-time clocks. Our solution can be readily adopted to make existing implementations of ◇P robust to process celeration, which can result from hardware upgrades, server overloads, denial-of-service attacks, and other system volatilities.
Keywords :
distributed processing; software fault tolerance; bichronal clock; celerating environment; crash fault detection; denial-of-service attack; distributed system; message delay; synchronous system; Acceleration; Clocks; Computer crashes; Delay; Detectors; Fault detection; Robustness; Time measurement; Upper bound; Velocity measurement
Conference_Title :
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4244-3751-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2009.5161050