Title :
Proactive blocking coordinated checkpointing with dynamic intervals
Author :
Lotfi, Mehdi ; Motamedi, Seyed Ahmad ; Bandarabadi, Mojtaba
Author_Institution :
Electr. Eng. Dept., Amirkabir Univ. of Technol., Tehran
Abstract :
In this paper we introduce a new proactive blocking coordinated checkpointing scheme with dynamic intervals for cluster computing systems. Many current schemes for increasing the availability of cluster computing systems are reactive methods that rely on redundancy in either space or time. These methods impose overhead on the cluster computing system even during failure-free execution. To minimize the performance loss (rollback and checkpoint overheads) caused by unexpected failures or by the unnecessary overhead of fault-tolerance mechanisms, we present a proactive method for the blocking coordinated checkpointing strategy. Existing checkpointing methods are static: they use a constant checkpointing interval and are based on the exponential distribution function. In this paper we instead use the Weibull distribution function to find a dynamic interval. Our method is based on an analysis of failure data from the LANL cluster system. Experimental results show that the average execution time of the NAS application is significantly reduced by the proposed method.
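The contrast the abstract draws between a constant interval (exponential failures) and a dynamic interval (Weibull failures) can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes a common heuristic: Young's first-order approximation, interval = sqrt(2 * C / rate), evaluated with the instantaneous Weibull hazard rate in place of a constant failure rate. The function names, the checkpoint cost, and the shape and scale values are all hypothetical and chosen only for illustration.

```python
import math


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate h(t) = (k / lam) * (t / lam)**(k - 1), for t > 0."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def young_interval(checkpoint_cost, failure_rate):
    """Young's first-order approximation of the checkpoint interval."""
    return math.sqrt(2.0 * checkpoint_cost / failure_rate)


def checkpoint_schedule(horizon, checkpoint_cost, shape, scale, t_start=1.0):
    """Return checkpoint times in (t_start, horizon].

    Assumption (not from the paper): each interval is Young's approximation
    evaluated with the Weibull hazard at the current time. t_start must be
    positive because the hazard is unbounded at t = 0 when shape < 1.
    """
    times = []
    t = t_start
    while True:
        t += young_interval(checkpoint_cost, weibull_hazard(t, shape, scale))
        if t > horizon:
            break
        times.append(t)
    return times


if __name__ == "__main__":
    cost = 0.05  # hypothetical checkpoint cost, in hours
    # shape = 1 reduces the Weibull to an exponential: constant interval.
    static = checkpoint_schedule(50.0, cost, shape=1.0, scale=100.0)
    # shape < 1 gives a decreasing hazard: intervals grow over time.
    dynamic = checkpoint_schedule(50.0, cost, shape=0.7, scale=100.0)
    print("static :", [round(t, 2) for t in static[:5]])
    print("dynamic:", [round(t, 2) for t in dynamic[:5]])
```

With a shape parameter of 1 the hazard is constant and the schedule is evenly spaced, which corresponds to the static case. With a shape parameter below 1 the hazard decreases over time, so under this heuristic the gaps between checkpoints widen and fewer checkpoints are taken late in the run.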
Keywords :
Weibull distribution; checkpointing; exponential distribution; program diagnostics; software fault tolerance; software performance evaluation; workstation clusters; Weibull distribution function; cluster computing system; dynamic interval; exponential distribution function; failure free execution time; fault tolerant mechanism; performance loss minimization; proactive blocking coordinated checkpointing; static checkpointing method; Availability; Clustering algorithms; Data analysis; Fault tolerance; Frequency synchronization; Hardware; Redundancy; Space technology; blocking coordinated checkpointing; proactive checkpointing
Conference_Title :
System Theory, 2009. SSST 2009. 41st Southeastern Symposium on
Conference_Location :
Tullahoma, TN
Print_ISBN :
978-1-4244-3324-7
ISSN :
0094-2898
DOI :
10.1109/SSST.2009.4806842