Title :
Performance comparison of a rotating coordinator and a leader based consensus algorithm
Author :
Urban, Patricia ; Hayashibara, Naohiro ; Schiper, Andre ; Katayama, Takuya
Author_Institution :
Japan Adv. Inst. of Sci. & Technol., Ishikawa, Japan
Abstract :
Protocols that solve agreement problems are essential building blocks for fault-tolerant distributed systems. While many protocols have been published, little has been done to analyze their performance, especially the performance of their fault tolerance mechanisms. In this paper, we compare two well-known asynchronous consensus algorithms. In both algorithms, a leader process tries to impose a decision, and another leader retries if the first leader fails to do so. The algorithms elect leaders differently: the Chandra-Toueg algorithm has a rotating leader, whereas processes in the Paxos algorithm elect leaders directly. We investigate the performance implications of this difference. In the system under study, processes send atomic broadcasts to each other. Consensus is used to decide the delivery order of messages. We evaluate the steady state latency in (1) runs with neither crashes nor suspicions, (2) runs with crashes, and (3) runs with no crashes in which correct processes are wrongly suspected to have crashed, as well as the transient latency after (4) one crash and (5) multiple correlated crashes. The results show that the Paxos algorithm tolerates frequent wrong suspicions (3) and correlated crashes (5) better, while the performance is comparable in all other scenarios.
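The difference between the two leader selection styles named in the abstract can be illustrated with a minimal sketch. It is not the paper's simulation model: the 0-based process numbering, the function names, and the "lowest-numbered non-suspected process" election rule are all assumptions made for illustration only.

```python
from typing import Set


def rotating_coordinator(round_no: int, n: int) -> int:
    """Rotation rule: processes 0..n-1 take turns, one per round (0-based, assumed)."""
    return round_no % n


def rounds_until_live_coordinator(start_round: int, n: int, crashed: Set[int]) -> int:
    """Count the rounds tried, including the successful one, until the
    rotation reaches a process that has not crashed."""
    if len(crashed) >= n:
        raise ValueError("no live process")
    tried = 1
    r = start_round
    while rotating_coordinator(r, n) in crashed:
        r += 1
        tried += 1
    return tried


def elected_leader(n: int, suspected: Set[int]) -> int:
    """Hypothetical direct election rule: pick the lowest-numbered
    process that is not currently suspected."""
    for p in range(n):
        if p not in suspected:
            return p
    raise ValueError("every process is suspected")


if __name__ == "__main__":
    crashed = {0, 1}  # two crashed processes out of five
    print(rounds_until_live_coordinator(0, 5, crashed))  # rotation passes through p0 and p1 first
    print(elected_leader(5, crashed))                    # election skips them in one step
```

In this toy setting the rotation has to step through each crashed process in turn, while the election rule selects a live process directly, provided the suspicions are accurate.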
Keywords :
distributed processing; fault tolerant computing; performance evaluation; protocols; Chandra-Toueg algorithm; Paxos algorithm; agreement problem solving; asynchronous algorithm; atomic broadcast; correlated crash; failure detector; failure simulation; fault tolerant distributed system; leader based consensus algorithm; message delivery order; performance comparison; protocol performance; rotating coordinator; steady state latency; transient latency; Algorithm design and analysis; Broadcasting; Computer crashes; Delay; Fault tolerance; Fault tolerant systems; Performance analysis; Protocols; Safety; Steady-state
Conference_Title :
Reliable Distributed Systems, 2004. Proceedings of the 23rd IEEE International Symposium on
Print_ISBN :
0-7695-2239-4
DOI :
10.1109/RELDIS.2004.1352999