DocumentCode :
800404
Title :
Fast asynchronous uniform consensus in real-time distributed systems
Author :
Hermant, Jean-François ; Le Lann, Gérard
Author_Institution :
Inst. Nat. de Recherche en Inf. et Autom., Le Chesnay, France
Volume :
51
Issue :
8
fYear :
2002
fDate :
8/1/2002 12:00:00 AM
Firstpage :
931
Lastpage :
944
Abstract :
We investigate whether asynchronous computational models and asynchronous algorithms can be considered for designing real-time distributed fault-tolerant systems. A priori, the lack of bounded finite delays is antagonistic with timeliness requirements. We show how to circumvent this apparent contradiction, via the principle of "late binding" of a solution to some (partially) synchronous model. This principle is shown to maximize the coverage of demonstrated safety, liveness, and timeliness properties. These general results are illustrated with the uniform consensus (UC) and real-time UC problems, assuming processor crashes and reliable communications, considering asynchronous solutions based upon unreliable failure detectors. We introduce the concept of fast failure detectors and show that the problem of building strong or perfect fast failure detectors in real systems can be stated as a distributed message scheduling problem. A generic solution to this problem is given and illustrated considering deterministic Ethernets. In passing, it is shown that, with our construction of unreliable failure detectors, asynchronous algorithms that solve UC have a worst-case termination lower bound that matches the optimal synchronous lower bound. Finally, we introduce FastUC, a novel solution to UC, that is based upon fast failure detectors.
Keywords :
distributed processing; fault diagnosis; fault tolerant computing; processor scheduling; real-time systems; synchronisation; asynchronous computational models; distributed fault-tolerant computing; liveness; partially synchronous computational models; real time systems; safety; schedulability; timeliness; uniform consensus; unreliable failure detectors; Algorithm design and analysis; Buildings; Computational modeling; Computer crashes; Delay; Detectors; Distributed computing; Fault tolerant systems; Real time systems; Safety;
fLanguage :
English
Journal_Title :
Computers, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9340
Type :
jour
DOI :
10.1109/TC.2002.1024740
Filename :
1024740
Link To Document :
بازگشت