DocumentCode :
3062191
Title :
Workload Adaptive Checkpoint Scheduling of Virtual Machine Replication
Author :
Gerofi, Balazs ; Ishikawa, Yutaka
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
fYear :
2011
fDate :
12-14 Dec. 2011
Firstpage :
204
Lastpage :
213
Abstract :
Checkpoint-recovery based Virtual Machine (VM) replication is an emerging approach to providing high availability for VM installations, particularly because of its inherent ability to handle symmetric multiprocessing (SMP) virtual machines, i.e., VMs with multiple virtual CPUs (vCPUs). However, it comes at the price of significant performance degradation of the application executed in the VM, because a large amount of state must be synchronized between the primary and the backup machines. Previous research on improving VM replication performance focused primarily on decreasing the amount of data transferred over the network, while relying on a constant checkpoint frequency. Our goal is to investigate how and to what extent performance degradation can be mitigated by adjusting the checkpoint period dynamically. We provide a comprehensive analysis of various workloads from the perspective of VM replication, paying special attention to their behavior as the number of vCPUs in the system increases. We propose several heuristics for scheduling replication checkpoints in order to improve quality of service. Our algorithm adapts dynamically to the properties of the workload being executed in the VM, such as changes in the number of dirtied memory pages and in network and disk I/O operations, as well as to the network bandwidth available for replication. We evaluate our scheduling algorithm over two network architectures: Gigabit Ethernet and Infiniband, a high-performance interconnect fabric. We find that checkpoint scheduling has a great impact on the performance of replicated virtual machines, and show that replicated virtual machines with up to 16 vCPUs can attain performance close to native VM execution, not only over high-performance but also over commercial network architectures.
Keywords :
checkpointing; local area networks; multiprocessing systems; scheduling; virtual machines; Gigabit Ethernet; Infiniband; backup machines; checkpoint-recovery; disk I/O operations; memory pages; network bandwidth; performance degradation; primary machines; quality of service; symmetric multiprocessing virtual machines; virtual CPU; virtual machine replication; workload adaptive checkpoint scheduling; Banking; Benchmark testing; Fault tolerant systems; Kernel; Processor scheduling; Random access memory; Virtual machining; Checkpoint-Recovery; Fault-Tolerance; Hypervisor; Replication; Virtualization
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Dependable Computing (PRDC), 2011 IEEE 17th Pacific Rim International Symposium on
Conference_Location :
Pasadena, CA
Print_ISBN :
978-1-4577-2005-5
Electronic_ISBN :
978-0-7695-4590-5
Type :
conf
DOI :
10.1109/PRDC.2011.32
Filename :
6133082