Title :
On Improving the Reliability of Internet Services through Active Replication
Author :
Ayari, Narjess ; Barbaron, Denis ; Lefèvre, Laurent
Author_Institution :
France Telecom R&D, Lannion
Abstract :
Fault tolerance can be defined as the capability of a system or component to continue normal operation despite the occurrence of a hardware or software fault. Frameworks providing a fault-tolerant service take advantage of resource redundancy to provide high availability. One of the critical challenges that led to this work is the observation that existing fault tolerance frameworks are not adapted to current and next-generation Internet services. Indeed, they do not provide consistent service-aware failure recovery capabilities. In particular, little attention has been paid to transport-level awareness, despite its important role in improving the reliability of connection-oriented services and stateful devices. In this paper, we evaluate an active-replication-based framework for highly available Internet services. The proposed framework is fully transparent to both client and server. Performance evaluations show that it incurs minimal overhead on end-to-end conversations during failure-free periods and performs well during failures.
Keywords :
Internet; software performance evaluation; software reliability; Internet service reliability; active replication; connection oriented services; fault tolerant service; performance evaluations; Availability; Fault tolerance; Fault tolerant systems; Hardware; Network servers; Performance evaluation; Redundancy; Robustness; Web and internet services; Web server; High availability; reliability; session level awareness; stateful devices; transport level awareness
Conference_Title :
Parallel and Distributed Computing, Applications and Technologies, 2008. PDCAT 2008. Ninth International Conference on
Conference_Location :
Otago
Print_ISBN :
978-0-7695-3443-5
DOI :
10.1109/PDCAT.2008.82