• DocumentCode
    1745491
  • Title

    Designing a service of failure detection in asynchronous distributed systems

  • Author

    Baldoni, Roberto ; Zito, Fabio

  • Author_Institution
    Dipartimento di Inf. e Sistemistica, Rome Univ., Italy
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    113
  • Lastpage
    120
  • Abstract
    Although introduced to solve the consensus problem in asynchronous distributed systems, the notion of an unreliable failure detector is a powerful tool for any distributed protocol: it improves performance by allowing aggressive time-outs to detect failures of the entities executing the protocol. We present the design of a Failure Detection Service (FDS) based on the notion of unreliable failure detectors introduced by T. Chandra and S. Toueg (1996). FDS detects crashed objects and entities that permanently omit to send messages, without requiring changes to the source code of the underlying protocols that use the service. FDS also provides an object-oriented interface to its subscribers and, more importantly, adds no network overhead if no entity subscribes to the service. The paper can also be seen as a first step towards a distributed implementation of a heartbeat-based failure management system as defined in the fault-tolerant CORBA specification.
  • Keywords
    distributed object management; distributed processing; object-oriented programming; protocols; software fault tolerance; user interfaces; FDS; Failure Detection Service; aggressive time-outs; asynchronous distributed systems; consensus problem; crashed objects; distributed implementation; distributed protocol; failure detection; fault-tolerant CORBA specification; heartbeat-based failure management system; object oriented interface; source code; unreliable failure detector; unreliable failure detectors; Buildings; Computer crashes; Delay; Detectors; Fault detection; Fault tolerance; Fault tolerant systems; Object detection; Protocols; Remuneration;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Object-Oriented Real-Time Distributed Computing, 2001. ISORC - 2001. Proceedings. Fourth IEEE International Symposium on
  • Conference_Location
    Magdeburg
  • Print_ISBN
    0-7695-1089-2
  • Type

    conf

  • DOI
    10.1109/ISORC.2001.922825
  • Filename
    922825