DocumentCode :
1745491
Title :
Designing a service of failure detection in asynchronous distributed systems
Author :
Baldoni, Roberto ; Zito, Fabio
Author_Institution :
Dipartimento di Inf. e Sistemistica, Rome Univ., Italy
fYear :
2001
fDate :
2001
Firstpage :
113
Lastpage :
120
Abstract :
Although the notion of an unreliable failure detector was introduced to solve the consensus problem in asynchronous distributed systems, it can serve as a powerful tool for any distributed protocol: it improves performance by allowing aggressive time-outs to detect failures of the entities executing the protocol. We present the design of a Failure Detection Service (FDS) based on the notion of unreliable failure detectors introduced by T. Chandra and S. Toueg (1996). FDS detects crashed objects and entities that permanently omit to send messages, without requiring changes to the source code of the underlying protocols that use the service. FDS also provides an object-oriented interface to its subscribers and, more importantly, adds no network overhead if no entity subscribes to the service. The paper can also be seen as a first step towards a distributed implementation of a heartbeat-based failure management system as defined in the fault-tolerant CORBA specification.
Keywords :
distributed object management; distributed processing; object-oriented programming; protocols; software fault tolerance; user interfaces; FDS; Failure Detection Service; aggressive time-outs; asynchronous distributed systems; consensus problem; crashed objects; distributed implementation; distributed protocol; failure detection; fault-tolerant CORBA specification; heartbeat-based failure management system; object oriented interface; source code; unreliable failure detector; unreliable failure detectors; Buildings; Computer crashes; Delay; Detectors; Fault detection; Fault tolerance; Fault tolerant systems; Object detection; Protocols; Remuneration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Object-Oriented Real-Time Distributed Computing, 2001. ISORC - 2001. Proceedings. Fourth IEEE International Symposium on
Conference_Location :
Magdeburg
Print_ISBN :
0-7695-1089-2
Type :
conf
DOI :
10.1109/ISORC.2001.922825
Filename :
922825