Title :
Process-replication technique for fault-tolerance and performance improvement in distributed computing systems
Author :
Chiu, Jane-Ferng ; Chiu, Ge-Ming
Author_Institution :
Dept. of Electron. Eng. & Technol., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
Abstract :
The paper presents a process-replication protocol that aims to provide fault tolerance as well as performance improvement for applications such as long-running and real-time tasks. An identical message delivery order is enforced on all replicas of a troupe, using multicasts for inter- and intra-troupe communication. The detailed design of the protocol is given in the paper. The protocol is self-contained in the sense that crashes in a troupe are handled internally without affecting the operation of other troupes. The crash-handling procedure is simple, and the associated overhead during fail-free operation is small. The protocol takes advantage of the redundancy of processes to expedite the completion of a distributed task by speeding up the determination of message sequences and the transmission of outgoing data messages, at the expense of small control messages. Simulation is carried out to show the performance improvement.
Keywords :
distributed processing; fault tolerant computing; message passing; performance evaluation; protocols; software reliability; crash-handling procedure; data messages; distributed computing systems; fail-free operation; fault-tolerance; intertroupe communication; intratroupe communication; message sequences; multicasts; overhead; performance improvement; process-replication technique; protocol; real-time tasks; redundancy; simulation; small control messages; Computational modeling; Computer crashes; Content addressable storage; Distributed computing; Fault tolerance; Fault tolerant systems; History; Multicast protocols; Redundancy; Resumes
Conference_Title :
Proceedings of the Third IEEE International Symposium on High Performance Distributed Computing, 1994
Conference_Location :
San Francisco, CA
Print_ISBN :
0-8186-6395-2
DOI :
10.1109/HPDC.1994.340239