Regional Information Center for Science and Technology - Optimizing virtual machine scheduling in NUMA multicore systems

DocumentCode :

602611

Title :

Optimizing virtual machine scheduling in NUMA multicore systems

Author :

Jia Rao ; Kun Wang ; Xiaobo Zhou ; Cheng-Zhong Xu

fYear :

2013

fDate :

23-27 Feb. 2013

Firstpage :

306

Lastpage :

317

Abstract :

An increasing number of new multicore systems use the Non-Uniform Memory Access (NUMA) architecture due to its scalable memory performance. However, the complex interplay among data locality, contention on shared on-chip memory resources, and cross-node data sharing overhead makes it difficult to deliver optimal and predictable program performance. Virtualization further complicates the scheduling problem. Due to abstract and inaccurate mappings from virtual hardware to machine hardware, program and system-level optimizations are often not effective within virtual machines. We find that the penalty to access the “uncore” memory subsystem is an effective metric to predict program performance in NUMA multicore systems. Based on this metric, we add NUMA awareness to virtual machine scheduling. We propose a Bias Random vCPU Migration (BRM) algorithm that dynamically migrates vCPUs to minimize the system-wide uncore penalty. We have implemented the scheme in the Xen virtual machine monitor. Experimental results on a two-way Intel NUMA multicore system with various workloads show that BRM improves application performance by up to 31.7% compared with the default Xen credit scheduler. Moreover, BRM achieves predictable performance with, on average, no more than 2% runtime variation.
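
The abstract does not spell out how BRM works internally, so the following is only a toy sketch of the general idea of biased random migration, not the authors' algorithm or the Xen implementation. It assumes a hypothetical, pre-measured table `penalty[vcpu][node]` giving the uncore penalty of running each vCPU on each NUMA node, and it treats those penalties as independent of one another, which real contention is not. The function names `biased_random_migration` and `total_penalty` are illustrative only.

```python
import random


def total_penalty(placement, penalty):
    """Sum the (hypothetical) uncore penalty over all vCPU placements."""
    return sum(penalty[vcpu][node] for vcpu, node in placement.items())


def biased_random_migration(placement, penalty, rounds=20, seed=0):
    """Toy biased random migration.

    placement: dict mapping vCPU name -> current NUMA node
    penalty:   dict mapping vCPU name -> {node: assumed uncore penalty}
    Returns a new placement; the input is left unchanged.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    placement = dict(placement)
    for _ in range(rounds):
        for vcpu, node in list(placement.items()):
            current = penalty[vcpu][node]
            best = min(penalty[vcpu], key=penalty[vcpu].get)
            gain = current - penalty[vcpu][best]
            if gain <= 0:
                continue  # already on a node with the lowest penalty
            # Bias: the larger the relative gain, the likelier the move.
            if rng.random() < gain / current:
                placement[vcpu] = best
    return placement


if __name__ == "__main__":
    penalty = {
        "v0": {0: 10.0, 1: 2.0},  # v0 is assumed to suffer on node 0
        "v1": {0: 5.0, 1: 5.0},   # v1 is assumed indifferent
    }
    start = {"v0": 0, "v1": 1}
    end = biased_random_migration(start, penalty)
    print(total_penalty(start, penalty), total_penalty(end, penalty))
```

In this sketch a vCPU moves only toward a node with a lower assumed penalty, so the total never increases. The randomness only decides when a move happens, which is one simple way to avoid migrating every vCPU in the same scheduling round.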

Keywords :

memory architecture; multiprocessing systems; performance evaluation; processor scheduling; virtual machines; BRM algorithm; Xen credit scheduler; Xen virtual machine monitor; bias random vCPU migration algorithm; complex interplay; cross-node data sharing overhead; data locality; machine hardware; memory subsystem; nonuniform memory access architecture; optimal program performance; predictable program performance; scalable memory performance; shared on-chip memory resources; system-level optimizations; system-wide uncore penalty; two-way Intel NUMA multicore system; virtual hardware; virtual machine scheduling optimization; Benchmark testing; Hardware; Instruction sets; Multicore processing; Sockets; Topology;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

High Performance Computer Architecture (HPCA2013), 2013 IEEE 19th International Symposium on

Conference_Location :

Shenzhen

ISSN :

1530-0897

Print_ISBN :

978-1-4673-5585-8

Type :

conf

DOI :

10.1109/HPCA.2013.6522328

Filename :

6522328

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=602611