DocumentCode :
1853421
Title :
Accelerated heartbeat protocols
Author :
Gouda, Mohamed G. ; McGuire, Tommy M.
Author_Institution :
Dept. of Comput. Sci., Texas Univ., Austin, TX, USA
fYear :
1998
fDate :
26-29 May 1998
Firstpage :
202
Lastpage :
209
Abstract :
Heartbeat protocols are used by distributed programs to ensure that if a process in a program terminates or fails, then the remaining processes in the program terminate. We present a class of heartbeat protocols that tolerate message loss. In these protocols, a root process periodically sends a beat message to every other process, then waits to receive a reply beat message from every other process. If the root process does not receive a reply (possibly due to message loss), the root process reduces by half the period for sending beat messages. We show that in practical situations, the parameters of these protocols can be chosen to achieve a good compromise between three contradictory objectives: reduce the rate of sending beat messages, reduce the detection delay, and still keep the probability of premature termination small.
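The period-halving rule described in the abstract can be illustrated with a minimal sketch of the root process's timing logic. The abstract states only that the period is halved after a missed reply; the names `t_max` and `t_min`, the reset to `t_max` after a successful round, and termination once the period falls below `t_min` are assumptions made here for illustration, not details confirmed by this record.

```python
def simulate_root(replies, t_max=8.0, t_min=1.0):
    """Simulate the root's beat period over a sequence of rounds.

    replies: iterable of booleans, True if every reply arrived in that round.
    t_max, t_min: hypothetical bounds on the period (assumed, not from the abstract).
    Returns (periods_used, terminated).
    """
    period = t_max
    periods_used = []
    for got_all_replies in replies:
        periods_used.append(period)   # root waits one period for replies
        if got_all_replies:
            period = t_max            # assumed: a good round restores the slow rate
        else:
            period /= 2               # from the abstract: halve the period
            if period < t_min:        # assumed: give up below the minimum period
                return periods_used, True
    return periods_used, False


# A lost reply accelerates the beat; a later reply slows it down again.
history, terminated = simulate_root([True, False, False, True, False])

# A peer that never replies is detected after a bounded total wait.
dead_history, dead_terminated = simulate_root([False] * 10)
detection_delay = sum(dead_history)
```

Under these assumed parameters, a larger `t_max` lowers the steady-state message rate, while the ratio of `t_max` to `t_min` sets how many consecutive losses are tolerated before the root terminates, which is the trade-off among the three objectives the abstract mentions.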
Keywords :
computer network reliability; message passing; protocols; software fault tolerance; beat message; detection delay; distributed programs; fault tolerance; heartbeat protocols; message loss; process termination; program termination; Acceleration; Computer networks; Delay; Detection algorithms; Fault detection; Heart beat; Heart rate detection; Protocols; Read only memory
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Distributed Computing Systems, 1998. Proceedings. 18th International Conference on
Conference_Location :
Amsterdam
ISSN :
1063-6927
Print_ISBN :
0-8186-8292-2
Type :
conf
DOI :
10.1109/ICDCS.1998.679503
Filename :
679503