Author :
Moser, L.E. ; Melliar-Smith, P.M. ; Agarwal, D.A. ; Budhia, R.K. ; Lingley-Papadopoulos, C.A. ; Archambault, T.P.
Author_Institution :
Dept. of Electr. & Comput. Eng., California Univ., Santa Barbara, CA, USA
Abstract :
The Totem system supports fault-tolerant applications in which distributed processes cooperate to perform a common task and in which replicated data must be updated consistently in the presence of asynchrony and faults. Reliable totally ordered delivery of messages to processes within process groups is provided on a single local-area network or over multiple local-area networks interconnected by gateways. Message ordering is consistent across the entire network, despite processor and communication faults, without requiring all processes to deliver all messages. The Totem system handles processor failure and recovery, as well as network partitioning and remerging, and provides membership and topology maintenance services.
Keywords :
LAN interconnection; local area networks; message passing; software fault tolerance; system recovery; Totem system; asynchrony; common task; communication faults; cooperating distributed processes; fault-tolerant applications; faults; gateway-interconnected networks; membership maintenance services; message ordering; multiple local-area networks; network partitioning; network remerging; process groups; processor failure; processor faults; processor recovery; reliable totally ordered message delivery; replicated data updating; single local-area network; topology maintenance services; Automatic control; Computer network reliability; Fault detection; Fault tolerance; Fault tolerant systems; Local area networks; Maintenance; Multicast protocols; Network topology; Telecommunication network reliability
Conference_Title :
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on
Conference_Location :
Pasadena, CA, USA
Print_ISBN :
0-8186-7079-7
DOI :
10.1109/FTCS.1995.466998