DocumentCode
2234584
Title
Process-replication technique for fault-tolerance and performance improvement in distributed computing systems
Author
Chiu, Jane-Ferng ; Chiu, Ge-Ming
Author_Institution
Dept. of Electron. Eng. & Technol., Nat. Taiwan Inst. of Technol., Taipei, Taiwan
fYear
1994
fDate
2-5 Aug 1994
Firstpage
236
Lastpage
243
Abstract
The paper presents a process-replication protocol that aims to provide both fault tolerance and improved performance for applications such as long-running and real-time tasks. An identical message delivery order is enforced on all replicas of a troupe using multicasts for inter- and intra-troupe communication. The detailed design of the protocol is given in the paper. The protocol is self-contained in the sense that crashes in a troupe are handled internally without affecting the operation of other troupes. The crash-handling procedure is simple, and the associated overhead during fail-free operation is small. The protocol takes advantage of the redundancy of processes to expedite the completion of a distributed task by speeding up the determination of message sequences and the transmission of outgoing data messages, at the expense of small control messages. Simulation is carried out to show the performance improvement.
Keywords
distributed processing; fault tolerant computing; message passing; performance evaluation; protocols; software reliability; crash-handling procedure; data messages; distributed computing systems; fail-free operation; fault-tolerance; intertroupe communication; intratroupe communication; message sequences; multicasts; overhead; performance improvement; process-replication technique; protocol; real-time tasks; redundancy; simulation; small control messages; Computational modeling; Computer crashes; Content addressable storage; Distributed computing; Fault tolerance; Fault tolerant systems; History; Multicast protocols; Redundancy; Resumes;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Distributed Computing, 1994., Proceedings of the Third IEEE International Symposium on
Conference_Location
San Francisco, CA
Print_ISBN
0-8186-6395-2
Type
conf
DOI
10.1109/HPDC.1994.340239
Filename
340239