• DocumentCode
    2840105
  • Title
    Failure Detectors in Homonymous Distributed Systems (with an Application to Consensus)
  • Author
    Arevalo, Sergio ; Anta, Antonio Fernandez ; Imbs, Damien ; Jimenez, Ernesto ; Raynal, Michel
  • Author_Institution
    EUI, Univ. Politec. de Madrid, Madrid, Spain
  • fYear
    2012
  • fDate
    18-21 June 2012
  • Firstpage
    275
  • Lastpage
    284
  • Abstract
    This paper studies homonymous distributed systems where processes are prone to crash failures and have no initial knowledge of the system membership (“homonymous” means that several processes may have the same identifier). New classes of failure detectors suited to these systems are first defined. Among them, the classes HΩ and HΣ are introduced; they are the homonymous counterparts of the classes Ω and Σ, respectively. (Recall that the pair 〈Ω, Σ〉 defines the weakest failure detector to solve consensus.) Then, the paper shows how HΩ and HΣ can be implemented in homonymous systems without membership knowledge (under different synchrony requirements). Finally, two algorithms are presented that use these failure detectors to solve consensus in homonymous asynchronous systems where there is no initial knowledge of the membership. One algorithm solves consensus with 〈HΩ, HΣ〉, while the other uses only HΩ but needs a majority of correct processes. Observe that systems with unique identifiers and anonymous systems are extreme cases of homonymous systems, from which it follows that all these results also apply to those systems. Interestingly, the new failure detector class HΩ can be implemented with partial synchrony, while the analogous class AΩ defined for anonymous systems cannot be implemented (even in synchronous systems). Hence, the paper provides the first proof that consensus can be solved in anonymous systems with only partial synchrony (and a majority of correct processes).
  • Keywords
    fault tolerant computing; message passing; system recovery; anonymous systems; consensus; crash failure; crash-prone message-passing distributed system; failure detectors; homonymous asynchronous system; homonymous distributed system; membership knowledge; partial synchrony; process failure; synchrony requirement; system membership; unique identifiers; Computer crashes; Context; Detectors; Distributed computing; Face; Nominations and elections; Safety; Agreement problem; Asynchrony; Consensus; Distributed computability; Homonymous system; Message-passing; Process crash; failure detector
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing Systems (ICDCS), 2012 IEEE 32nd International Conference on
  • Conference_Location
    Macau
  • ISSN
    1063-6927
  • Print_ISBN
    978-1-4577-0295-2
  • Type
    conf
  • DOI
    10.1109/ICDCS.2012.13
  • Filename
    6258000