DocumentCode :
1949492
Title :
Understanding the virtualization "Tax" of scale-out pass-through GPUs in GaaS clouds: An empirical study
Author :
Ming Liu ; Tao Li ; Neo Jia ; Andy Currid ; Vladimir Troy
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Florida, Gainesville, FL, USA
fYear :
2015
fDate :
7-11 Feb. 2015
Firstpage :
259
Lastpage :
270
Abstract :
Pass-through techniques enable virtual machines to directly access hardware GPU resources in an exclusive mode, delivering extraordinary graphics performance for client users in GaaS clouds. However, the virtualization overheads of pass-through GPUs may decrease the frame rate of graphics workloads by reducing the occupancy rate of the GPU working queue. In this work, we make the first attempt to characterize pass-through GPUs running in different consolidation scenarios and uncover the root causes of these overheads. Towards this end, we set up state-of-the-art empirical platforms equipped with NVIDIA GRID GPUs and execute graphics-intensive workloads running in GaaS clouds. We first demonstrate the existence of virtualization overheads, which can slow down the GPU command generation rate. Compared with a bare-metal system, the performance of pass-through GPUs degrades by 9.0% and 21.5% under a single VM and 8 VMs, respectively. We analyze the workflow of the Windows display driver model and the VMEXIT event distribution, and identify four factors (i.e., HLT instruction and idle domain, external interrupt delivery, IOMMU, and memory subsystem) that contribute to the performance degradation. Our evaluation results show that: (1) the VM-VMM context switches caused by HLT instructions and wake-up interrupt injection of an idle domain result in 66.7% idle time for a single pass-through GPU; (2) external interrupt delivery and tasklet processing cause additional overheads: when 8 VMs are consolidated, the interrupt delivery processing time and interrupt frequency rise by 30.7% and 127.3%, respectively; (3) the existing IOMMU design scales well with pass-through GPUs; and (4) interactions of the guest domains' software stacks impact the hardware prefetching mechanism so that it fails to compensate for the rapidly growing LLC miss rate when more pass-through GPU VMs are added.
To the best of our knowledge, this is the first work that characterizes pass-through GPU virtualization overheads and their underlying causes. This study highlights valuable insights for improving the performance of future virtualized GPU systems.
Keywords :
cloud computing; graphics processing units; storage management; virtual machines; GPU working queue; GaaS cloud; HLT instruction and idle domain; NVIDIA GRID GPU; VMEXIT event distribution; Windows display driver model; external interrupt delivery; graphics workload; graphics-as-a-service; hardware prefetching mechanism; occupancy rate; scale-out pass-through GPU; virtual machine; virtualized GPU system; Degradation; Graphics processing units; Hardware; Performance evaluation; Sockets; Virtual machine monitors; Virtualization;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2015 IEEE 21st International Symposium on High Performance Computer Architecture (HPCA)
Conference_Location :
Burlingame, CA, USA
Type :
conf
DOI :
10.1109/HPCA.2015.7056038
Filename :
7056038