Title :
The Wackamole approach to fault tolerant networks
Author :
Amir, Yair ; Caudy, Ryan ; Munjal, Ashima ; Schlossnagle, Theo ; Tutu, C.
Author_Institution :
Dept. of Comput. Sci., Johns Hopkins Univ., Baltimore, MD, USA
Abstract :
We present Wackamole, a high availability tool for clusters of servers. Wackamole ensures that a server handles the requests that arrive on any of the service's public IP addresses. Wackamole is a completely distributed software solution based on a provably correct algorithm that negotiates the assignment of IP addresses among the available servers upon detection of faults and recoveries, and provides N-way fail-over, so that any one of a number of servers can cover for any other. Using a simple algorithm that utilizes strong group communication semantics, Wackamole demonstrates the application of group communication to address a critical availability problem at the core of the system, even in the presence of cascading network or server faults and recoveries. The same architecture is extended to provide a similar service for highly available routers.
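The N-way fail-over idea in the abstract can be illustrated with a minimal sketch. This is not Wackamole's published algorithm: it assumes that a group communication layer has already delivered the same membership view to every live server, so each server can compute the same assignment locally with a deterministic rule. The function name `reassign`, the least-loaded tie-breaking rule, and the example addresses are all hypothetical.

```python
def reassign(vips, members, current):
    """Return a {vip: server} map in which every address has a live owner.

    vips    -- iterable of public IP addresses the service must cover
    members -- set of servers in the agreed membership view (assumed input)
    current -- the previous {vip: server} assignment, possibly stale
    """
    if not members:
        raise ValueError("no live servers to cover the addresses")
    live = sorted(members)
    wanted = set(vips)

    # Keep every assignment whose owner is still in the membership view.
    new = {vip: owner for vip, owner in current.items()
           if owner in members and vip in wanted}

    # Count how many addresses each live server already holds.
    load = {server: 0 for server in live}
    for owner in new.values():
        load[owner] += 1

    # Give each uncovered address to the least-loaded live server.
    # Ties are broken by server name so all servers reach the same result.
    for vip in sorted(wanted):
        if vip not in new:
            target = min(live, key=lambda s: (load[s], s))
            new[vip] = target
            load[target] += 1
    return new


if __name__ == "__main__":
    vips = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
    first = reassign(vips, {"a", "b", "c"}, {})
    print(first)
    # Server "b" fails; the remaining servers cover its address.
    print(reassign(vips, {"a", "c"}, first))
```

Because the rule depends only on the membership view and the previous assignment, any surviving server can take over any address, which is the property the abstract calls N-way fail-over. A real deployment would also have to claim and release the addresses on the network interfaces, which this sketch omits.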
Keywords :
Internet; computer network reliability; distributed algorithms; fault tolerant computing; local area networks; military communication; military computing; protocols; telecommunication security; DARPA; N-way fail-over; Wackamole; assignment negotiation; cascading faults; distributed software; fault tolerant networks; group communication semantics; high availability tool; highly available routers; provably correct algorithm; public IP addresses; server clusters; Availability; Clustering algorithms; Computer science; Fault tolerance; Hardware; Media Access Protocol; Network servers; Redundancy; Software algorithms; Switches;
Conference_Title :
DARPA Information Survivability Conference and Exposition, 2003. Proceedings
Print_ISBN :
0-7695-1897-4
DOI :
10.1109/DISCEX.2003.1194920