DocumentCode :
3179318
Title :
A flexible clustered approach to high availability
Author :
Hughes-Fenchel, G.
Author_Institution :
Lucent Technol., Murray Hill, NJ, USA
fYear :
1997
fDate :
24-27 June 1997
Firstpage :
314
Lastpage :
318
Abstract :
The Reliable Clustered Computing project created a system that enables applications to improve the reliability of off-the-shelf computers from a typical 99% (about 90 hours of downtime per year) to 99.99% (under one hour of downtime per year) in a cost-effective manner. The chief constraints were the need to achieve high reliability while minimizing cost and maintaining vendor independence. This was realized by creating a vendor-independent clustered configuration composed of two or more computers capable of recovering from hardware or software errors by restarting one or more processes on the current machine or by failing over one or more processes to another machine. Only two inexpensive custom hardware components were required for this solution: a WatchDog, to monitor component status, and a PowerDog, to control electrical power to processing elements (and optional peripherals). The bulk of the functionality was provided by software.
Keywords :
fault tolerant computing; reliability; system recovery; PowerDog; Reliable Clustered Computing; WatchDog; clustered configuration; component status; electrical power; high availability; high reliability; off the shelf computers; Application software; Availability; Computer industry; Condition monitoring; Fault detection; Hardware; Maintenance; Telecommunication computing; Virtual machine monitors; Virtual machining
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fault-Tolerant Computing, 1997. FTCS-27. Digest of Papers., Twenty-Seventh Annual International Symposium on
Conference_Location :
Seattle, WA, USA
ISSN :
0731-3071
Print_ISBN :
0-8186-7831-3
Type :
conf
DOI :
10.1109/FTCS.1997.614105
Filename :
614105