Title :
Performance analysis of repairable cluster of workstations
Author :
Al-Karaki, Jamal N.
Author_Institution :
Dept. of Electr. Eng. & Comput. Eng., Iowa State Univ., Ames, IA, USA
Abstract :
Summary form only given. In this paper, an analytical model is developed and used to evaluate the average response time of a repairable cluster of workstations. Using queuing theory, closed-form solutions are obtained for the response time of fault-tolerant clusters of workstations. The workstations (nodes) in the cluster are divided into two sets: an active set and a backup set. Fault tolerance is achieved by having the active nodes replicate their services on the backup nodes. Active nodes periodically checkpoint their state on the backups. If an active node fails, one of the backups takes over and joins the active set. Two immediate repair mechanisms are considered for repairing faulty nodes in the system. Besides being in closed form, the analytical results presented in this paper have several advantages over those of previous work: there is no longer any need to solve a set of recursive equations, and the results reveal many of the characteristics of the system.
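The active/backup scheme described in the abstract can be illustrated with a minimal sketch. This is not the paper's analytical model: the `Cluster` class, its method names, the deterministic choice of replacement, and the rule that a repaired node rejoins as a backup are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy model of a cluster split into an active set and a backup set."""
    active: set = field(default_factory=set)
    backup: set = field(default_factory=set)
    checkpoints: dict = field(default_factory=dict)  # node -> last saved state

    def checkpoint(self, node, state):
        # An active node periodically saves its state on the backups.
        if node not in self.active:
            raise ValueError(f"{node} is not active")
        self.checkpoints[node] = state

    def fail(self, node):
        # The failed node leaves the active set; one backup takes over and
        # inherits the last checkpoint (None if the node never checkpointed).
        self.active.remove(node)
        state = self.checkpoints.pop(node, None)
        if not self.backup:
            return None  # no backup left, so active capacity shrinks
        replacement = min(self.backup)  # deterministic pick (assumption)
        self.backup.remove(replacement)
        self.active.add(replacement)
        self.checkpoints[replacement] = state
        return replacement

    def repair(self, node):
        # Assumption: a repaired node rejoins the cluster as a backup.
        self.backup.add(node)


cluster = Cluster(active={"a1", "a2"}, backup={"b1", "b2"})
cluster.checkpoint("a1", {"jobs_done": 10})
replacement = cluster.fail("a1")
cluster.repair("a1")
```

In this sketch the size of the active set stays constant as long as a backup is available, which is the property that makes the response time of such a cluster depend on the failure and repair behavior.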
Keywords :
fault tolerance; performance evaluation; queueing theory; system recovery; workstation clusters; backup nodes; checkpoint; performance analysis; Closed-form solution; Delay; Fault tolerant systems; Multiprocessing systems; Power system reliability; Queueing analysis; Supercomputers; Workstations;
Conference_Title :
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International
Print_ISBN :
0-7695-2132-0
DOI :
10.1109/IPDPS.2004.1303316