Title :
Scheduling Multi-tenant Cloud Workloads on Accelerator-Based Systems
Author :
Sengupta, Dipak ; Goswami, Anshuman ; Schwan, Karsten ; Pallavi, Krishna
Author_Institution :
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA, USA
Abstract :
Accelerator-based systems are rapidly becoming platforms of choice for high-end cloud services. There is therefore a need to move from the current model, in which high-performance applications explicitly and programmatically select the GPU devices on which to run, to a dynamic model in which GPUs are treated as first-class schedulable entities. The Strings scheduler realizes this vision by decomposing the GPU scheduling problem into a combination of load balancing and per-device scheduling: (i) device-level scheduling efficiently uses all of a GPU's hardware resources, including its computational and data movement engines, and (ii) load balancing goes beyond obtaining high throughput to ensure fairness by prioritizing GPU requests that have attained the least service. With these methods, Strings achieves improvements in system throughput and fairness of up to 8.70× and 13%, respectively, compared to the CUDA runtime.
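The two-level decomposition described in the abstract can be illustrated with a toy model. The sketch below is not the Strings implementation: it assumes each request carries a known cost, picks the tenant with the least attained service first, and places the request on the least-loaded device. All names (`pick_device`, `schedule`, the `(tenant, cost)` request tuples) are hypothetical and chosen only for illustration.

```python
def pick_device(device_load):
    """Load balancing (assumed policy): index of the least-loaded device."""
    return min(range(len(device_load)), key=lambda d: device_load[d])


def schedule(requests, num_devices):
    """Dispatch (tenant, cost) requests and return (tenant, device) pairs
    in dispatch order. Cost is a hypothetical unit of GPU service."""
    attained = {}  # service each tenant has received so far
    pending = {}   # per-tenant FIFO queue of request costs
    for tenant, cost in requests:
        attained.setdefault(tenant, 0.0)
        pending.setdefault(tenant, []).append(cost)

    device_load = [0.0] * num_devices
    order = []
    while any(pending.values()):
        # Fairness: serve the tenant with the least attained service;
        # ties are broken by tenant name to keep the result deterministic.
        tenant = min(
            (t for t in pending if pending[t]),
            key=lambda t: (attained[t], t),
        )
        cost = pending[tenant].pop(0)
        dev = pick_device(device_load)
        device_load[dev] += cost
        attained[tenant] += cost
        order.append((tenant, dev))
    return order


if __name__ == "__main__":
    demo = [("a", 4), ("a", 4), ("b", 1), ("b", 1), ("b", 1)]
    for tenant, dev in schedule(demo, num_devices=2):
        print(tenant, "-> device", dev)
```

In this toy model a tenant issuing many small requests is served ahead of a tenant that has already received a large share, which is the intuition behind least-attained-service prioritization. Per-device concerns such as overlapping compute and data movement are not modeled here.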
Keywords :
cloud computing; graphics processing units; parallel processing; resource allocation; scheduling; GPU scheduling problem; accelerator-based systems; data movement engines; device-level scheduling; dynamic model; high end cloud services; high performance applications; load balancing; multitenant cloud workload scheduling; per-device scheduling; strings scheduler; Context; Graphics processing units; Processor scheduling; Runtime; Servers; Switches; Synchronization; CUDA; GPU; Multi-tenancy; hierarchical scheduling; runtime systems; virtualization;
Conference_Title :
High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
Conference_Location :
New Orleans, LA
Print_ISBN :
978-1-4799-5499-5