Title :
Fault tolerance management in IaaS clouds
Author :
Jhawar, Ravi ; Piuri, V.
Author_Institution :
Dipt. di Inf., Univ. degli Studi di Milano, Crema, Italy
Abstract :
Fault tolerance, reliability, and availability in Cloud computing are critical to ensuring correct and continuous system operation even in the presence of failures. In this paper, we present an approach to evaluating fault tolerance mechanisms that use virtualization technology to transparently increase the reliability and availability of applications deployed in virtual machines in a Cloud. In contrast to several existing solutions that assume independent failures, we take into account the failure behavior of various server components, the network, and power distribution in a typical Cloud computing infrastructure, the correlation between individual failures, and the impact of each failure on users' applications. We use this evaluation to study fault tolerance mechanisms under different deployment contexts, and as the basis for a methodology for identifying and selecting mechanisms that match users' fault tolerance requirements.
Keywords :
cloud computing; failure analysis; fault tolerance; virtual machines; IaaS clouds; cloud computing infrastructure; continuous system operation; failure behavior; fault tolerance management; fault tolerance mechanism; fault tolerance requirements; power distribution; virtualization technology; Availability; Fault tolerance; Fault tolerant systems; Fault trees; Markov processes; Servers; Fault Tolerance Management; Fault Tolerance as a Service; Infrastructure Clouds
Conference_Title :
Satellite Telecommunications (ESTEL), 2012 IEEE First AESS European Conference on
Conference_Location :
Rome
Print_ISBN :
978-1-4673-4687-0
Electronic_ISBN :
978-1-4673-4686-3
DOI :
10.1109/ESTEL.2012.6400113