DocumentCode
1435416
Title
Improving the Robustness of Distributed Failure Detectors in Adverse Conditions
Author
Lemos, F.T.C. ; Sato, L.M.
Author_Institution
Univ. de Sao Paulo (USP), Sao Paulo, Brazil
Volume
10
Issue
1
fYear
2012
Firstpage
1364
Lastpage
1369
Abstract
Failure detection is at the core of most fault tolerance strategies, but it often depends on reliable communication. We present new failure detection algorithms suitable as components of a fault tolerance system deployed under adverse network conditions (such as loosely connected and loosely administered computing grids). The algorithms pack redundancy into heartbeat messages, thereby improving on the robustness of traditional protocols. Results from experimental tests conducted in a simulated environment with adverse network conditions show significant improvement over existing solutions.
Keywords
protocols; telecommunication network reliability; adverse network conditions; distributed failure detectors; fault tolerance strategies; heartbeat messages; reliable communication; Biomedical monitoring; Detectors; Fault tolerance; Heart beat; Monitoring; Payloads; Robustness; Distributed Failure Detectors; Failure Detection; Fault Tolerance
fLanguage
English
Journal_Title
Latin America Transactions, IEEE (Revista IEEE America Latina)
Publisher
IEEE
ISSN
1548-0992
Type
jour
DOI
10.1109/TLA.2012.6142485
Filename
6142485