Title :
Building a High Serviceability Model by Checkpointing and Replication Strategy in Cloud Computing Environments
Author :
Sun, Dawei ; Chang, Guiran ; Miao, Changsheng ; Wang, Xingwei
Author_Institution :
Sch. of Inf. Sci. & Eng., Northeastern Univ., Shenyang, China
Abstract :
Achieving high fault tolerance is one of the major obstacles to high serviceability cloud computing, because fault tolerance plays a key role in ensuring cloud serviceability. In most current clouds, checkpointing, the process of saving application states, and replication, the process of replicating hot data, usually to stable storage, are the two most common fault tolerance strategies. However, when, where, and how often to insert checkpoints or to replicate hot data remain open challenges that are largely ignored in clouds. In this paper, definitions of fault, error, and failure in a cloud are given, and a high serviceability model based on a checkpointing and replication strategy, HSCR, is put forward. The work includes: (1) analyzing the mathematical relationship between different failure rates and two fault tolerance strategies, namely the checkpointing strategy and the data replication strategy; and (2) building a high serviceability checkpointing fault tolerance model and a high serviceability replication fault tolerance model, then combining the two models to maximize serviceability and meet the service level objectives (SLOs). Experimental results demonstrate that HSCR has high potential: it provides efficient fault tolerance enhancements, significant cloud serviceability improvement, and high SLO satisfaction.
Keywords :
checkpointing; cloud computing; software fault tolerance; HSCR; SLO satisfaction; cloud computing environment; cloud serviceability improvement; failure rate; fault tolerance; hot data replication; mathematical relationship; replication strategy; serviceability model; stable storage; Analytical models; Computational modeling; Density functional theory; Fault tolerant systems; checkpointing model; high serviceability; replication model
Conference_Title :
Distributed Computing Systems Workshops (ICDCSW), 2012 32nd International Conference on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-1423-7
DOI :
10.1109/ICDCSW.2012.6