Title :
Availability requirement for a fault-management server in high-availability communication systems
Author :
Sun, Hairong ; Han, James J. ; Levendel, Haim
Author_Institution :
High Reliability & Availability Technol. Center, Motorola, Deer Park, IL, USA
fDate :
6/1/2003
Abstract :
This paper investigates the availability requirement for the fault management server in high-availability communication systems. The study shows that the fault management server itself does not need 99.999% availability to guarantee 99.999% system availability, as long as the fail-safe ratio (the probability that a failure of the fault management server does not bring down the system) and the fault coverage ratio (the probability that a failure in the system is detected and recovered by the fault management server) are sufficiently high. Tradeoffs can be made among the availability of the fault management server, the fail-safe ratio, and the fault coverage ratio to optimize system availability. A cost-effective design for the fault management server is proposed.
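Illustrative sketch (not taken from the paper): the tradeoff described in the abstract can be made concrete with a rough first-order estimate. The formula below is an assumption made for illustration only and is not the paper's Markov model. The function name `system_availability`, the fault rate, and the two recovery times are hypothetical values chosen to show the effect.

```python
def system_availability(a_fm, fail_safe, coverage,
                        fault_rate=1e-4, t_auto=0.01, t_manual=1.0):
    """First-order estimate of steady-state system availability.

    Assumed toy model (NOT the paper's Markov model):
      a_fm       availability of the fault management (FM) server
      fail_safe  probability that an FM server failure does not bring
                 down the system
      coverage   probability that a system fault is detected and
                 recovered by the FM server
      fault_rate system faults per hour (hypothetical)
      t_auto     hours of downtime for an automatically recovered fault
      t_manual   hours of downtime for a fault needing manual recovery
    """
    # Fraction of time the system is down because the FM server itself
    # is down and its failure was not fail-safe.
    u_fm = (1.0 - a_fm) * (1.0 - fail_safe)

    # A fault is recovered automatically only if it is covered AND the
    # FM server happens to be up when the fault occurs.
    effective_coverage = coverage * a_fm
    mean_downtime = (effective_coverage * t_auto
                     + (1.0 - effective_coverage) * t_manual)

    # Downtime fraction from system faults (valid when it is small).
    u_faults = fault_rate * mean_downtime

    return 1.0 - (u_fm + u_faults)


if __name__ == "__main__":
    target = 0.99999  # "five nines"
    cases = [
        (0.999, 0.999, 0.99),    # three-nines server, high ratios
        (0.999, 0.9, 0.99),      # same server, poor fail-safe ratio
        (0.99999, 0.999, 0.5),   # five-nines server, poor coverage
    ]
    for a_fm, s, c in cases:
        a_sys = system_availability(a_fm, s, c)
        verdict = "meets" if a_sys >= target else "misses"
        print(f"A_fm={a_fm}, fail-safe={s}, coverage={c}: "
              f"A_sys={a_sys:.7f} ({verdict} target)")
```

Under these assumed numbers, the first case reaches the five-nines target with only a three-nines server, while the other two miss it. This matches the abstract's qualitative claim that the fail-safe and fault coverage ratios can matter more than the server's own availability.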
Keywords :
Markov processes; computer network management; network servers; probability; telecommunication network management; telecommunication network reliability; Markov model; availability requirement; fail-safe ratio; fault coverage ratio; fault management server failure; fault-management server; high-availability communication systems; Availability; Cost function; Fault detection; Fault tolerance; Network servers; Power supplies; Power system management; Software performance; Sun; Telecommunication traffic
Journal_Title :
Reliability, IEEE Transactions on
DOI :
10.1109/TR.2003.812624