DocumentCode :
1996730
Title :
Dynamic Sharing of GPUs in Cloud Systems
Author :
Diab, Khaled M. ; Rafique, M. Mustafa ; Hefeeda, Mohamed
Author_Institution :
Qatar Comput. Res. Inst. (QCRI), Qatar Found., Doha, Qatar
fYear :
2013
fDate :
20-24 May 2013
Firstpage :
947
Lastpage :
954
Abstract :
The use of computational accelerators, specifically programmable GPUs, is becoming popular in cloud computing environments. Cloud vendors currently provide GPUs as dedicated resources to cloud users, which may result in under-utilization of these expensive GPU resources. In this work, we propose gCloud, a framework that provides GPUs as on-demand computing resources to cloud users. gCloud gives cloud users on-demand access to local and remote GPUs, granted only when the target GPU kernel is ready for execution. To improve GPU utilization, gCloud efficiently shares GPU resources among concurrent applications from different cloud users. Moreover, it reduces the inter-application interference of concurrent kernels contending for GPU resources by considering the local and global memory usage, the number of threads, and the number of thread blocks of each kernel. It schedules concurrent kernels on the available GPUs such that the overall inter-application interference across the cluster is minimized. We implemented gCloud as an independent module and integrated it with the OpenStack cloud computing platform. Evaluation of gCloud using representative applications shows that it improves the utilization of GPU resources by 56.3% on average compared to current state-of-the-art systems that serialize GPU kernel executions. Moreover, gCloud significantly reduces the completion time of GPU applications; e.g., in our experiments running a mix of 8 to 28 GPU applications on 4 NVIDIA Tesla GPUs, gCloud achieves up to a 430% reduction in the total completion time.
Keywords :
cloud computing; graphics processing units; Cloud Systems; Dynamic Sharing; GPU kernel executions; cloud computing environments; computational accelerators; concurrent applications; expensive GPU resources; gCloud; global memory; inter-application interference; local memory; on-demand computing resources; OpenStack cloud computing platform; Cloud computing; Context; Graphics processing units; Instruction sets; Kernel; Memory management; Message systems;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
Conference_Location :
Cambridge, MA
Print_ISBN :
978-0-7695-4979-8
Type :
conf
DOI :
10.1109/IPDPSW.2013.102
Filename :
6650978