Title :
Reliability Analysis of Self-Healing Network using Discrete-Event Simulation
Author :
Angskun, Thara ; Bosilca, George ; Fagg, Graham ; Pjesivac-Grbovic, J. ; Dongarra, Jack J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Tennessee, Knoxville, TN
Abstract :
The number of processors embedded in high-performance computing platforms is continuously increasing to accommodate users' desire to solve larger and more complex problems. However, as the number of components increases, so does the probability of failure. Thus, both the scalability and the fault tolerance of software are important issues in this field. To ensure the reliability of the software, especially under failure conditions, reliability analysis is needed. Discrete-event simulation offers an attractive alternative to traditional Markovian-based analytical models, which often have an intractably large state space. In this paper, we analyze the reliability of a self-healing network developed for parallel runtime environments using discrete-event simulation. The network is designed to support the transmission of messages across multiple nodes and, at the same time, to protect against node and process failures. Results demonstrate the flexibility of a discrete-event simulation approach for studying network behavior under failure conditions and various protocol parameters, message types, and routing algorithms.
Keywords :
computer network reliability; discrete event simulation; software fault tolerance; telecommunication network routing; discrete event simulation; failure conditions; fault tolerance; message transmission; network behavior; parallel runtime environments; protocol parameters; reliability analysis; routing algorithms; self-healing network; Analytical models; Discrete event simulation; Failure analysis; Fault tolerance; Fault trees; High performance computing; Petri nets; Protection; Runtime environment; Tree graphs;
Conference_Title :
Cluster Computing and the Grid, 2007. CCGRID 2007. Seventh IEEE International Symposium on
Conference_Location :
Rio de Janeiro
Print_ISBN :
0-7695-2833-3
DOI :
10.1109/CCGRID.2007.95