Title :
Optimizing virtual machine scheduling in NUMA multicore systems
Author :
Jia Rao ; Kun Wang ; Xiaobo Zhou ; Cheng-Zhong Xu
Abstract :
An increasing number of new multicore systems use the Non-Uniform Memory Access (NUMA) architecture due to its scalable memory performance. However, the complex interplay among data locality, contention on shared on-chip memory resources, and cross-node data sharing overhead makes it difficult to deliver optimal and predictable program performance. Virtualization further complicates the scheduling problem. Due to abstract and inaccurate mappings from virtual hardware to machine hardware, program and system-level optimizations are often not effective within virtual machines. We find that the penalty to access the “uncore” memory subsystem is an effective metric to predict program performance in NUMA multicore systems. Based on this metric, we add NUMA awareness to virtual machine scheduling. We propose a Bias Random vCPU Migration (BRM) algorithm that dynamically migrates vCPUs to minimize the system-wide uncore penalty. We have implemented the scheme in the Xen virtual machine monitor. Experimental results on a two-way Intel NUMA multicore system with various workloads show that BRM is able to improve application performance by up to 31.7% compared with the default Xen credit scheduler. Moreover, BRM achieves predictable performance with, on average, no more than 2% runtime variation.
Keywords :
memory architecture; multiprocessing systems; performance evaluation; processor scheduling; virtual machines; BRM algorithm; Xen credit scheduler; Xen virtual machine monitor; bias random vCPU migration algorithm; complex interplay; cross-node data sharing overhead; data locality; machine hardware; memory subsystem; nonuniform memory access architecture; optimal program performance; predictable program performance; scalable memory performance; shared on-chip memory resources; system-level optimizations; system-wide uncore penalty; two-way Intel NUMA multicore system; virtual hardware; virtual machine scheduling optimization; Benchmark testing; Hardware; Instruction sets; Multicore processing; Sockets; Topology;
Conference_Title :
High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on
Conference_Location :
Shenzhen
Print_ISBN :
978-1-4673-5585-8
DOI :
10.1109/HPCA.2013.6522328