DocumentCode :
602611
Title :
Optimizing virtual machine scheduling in NUMA multicore systems
Author :
Jia Rao ; Kun Wang ; Xiaobo Zhou ; Cheng-Zhong Xu
fYear :
2013
fDate :
23-27 Feb. 2013
Firstpage :
306
Lastpage :
317
Abstract :
An increasing number of new multicore systems use the Non-Uniform Memory Access (NUMA) architecture due to its scalable memory performance. However, the complex interplay among data locality, contention on shared on-chip memory resources, and cross-node data sharing overhead makes it difficult to deliver optimal and predictable program performance. Virtualization further complicates the scheduling problem. Due to abstract and inaccurate mappings from virtual hardware to machine hardware, program and system-level optimizations are often not effective within virtual machines. We find that the penalty for accessing the “uncore” memory subsystem is an effective metric to predict program performance in NUMA multicore systems. Based on this metric, we add NUMA awareness to virtual machine scheduling. We propose a Bias Random vCPU Migration (BRM) algorithm that dynamically migrates vCPUs to minimize the system-wide uncore penalty. We have implemented the scheme in the Xen virtual machine monitor. Experimental results on a two-way Intel NUMA multicore system with various workloads show that BRM improves application performance by up to 31.7% compared with the default Xen credit scheduler. Moreover, BRM achieves predictable performance with, on average, no more than 2% runtime variation.
Keywords :
memory architecture; multiprocessing systems; performance evaluation; processor scheduling; virtual machines; BRM algorithm; Xen credit scheduler; Xen virtual machine monitor; bias random vCPU migration algorithm; complex interplay; cross-node data sharing overhead; data locality; machine hardware; memory subsystem; nonuniform memory access architecture; optimal program performance; predictable program performance; scalable memory performance; shared on-chip memory resources; system-level optimizations; system-wide uncore penalty; two-way Intel NUMA multicore system; virtual hardware; virtual machine scheduling optimization; Benchmark testing; Hardware; Instruction sets; Multicore processing; Sockets; Topology;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on
Conference_Location :
Shenzhen
ISSN :
1530-0897
Print_ISBN :
978-1-4673-5585-8
Type :
conf
DOI :
10.1109/HPCA.2013.6522328
Filename :
6522328