Title :
Symphony: A Scheduler for Client-Server Applications on Coprocessor-Based Heterogeneous Clusters
Author :
Rafique, M. Mustafa ; Cadambi, Srihari ; Rao, Kunal ; Butt, Ali R. ; Chakradhar, Srimat
Author_Institution :
Dept. of Comput. Sci., Virginia Tech, Blacksburg, VA, USA
Abstract :
Coprocessors such as GPUs are increasingly being deployed in clusters to process scientific and compute-intensive jobs. In this work, we study whether GPU-based heterogeneous clusters can benefit client-server applications. Specifically, we consider the practical situation where multiple client-server applications share a heterogeneous cluster (multi-tenancy) and experience unpredictable variations in incoming client request rates, including steep load spikes. Even for "compute-intensive" client-server applications, it is unclear whether a GPU-based cluster can seamlessly deliver acceptable response times in the presence of multi-tenancy and load spikes. We argue that a cluster-level scheduler that is aware of application load, request deadlines, and the heterogeneity is necessary in this situation. We propose a novel scheduler called Symphony that enables efficient, dynamic sharing of a GPU-based heterogeneous cluster across multiple concurrently-executing client-server applications, each with arbitrary load spikes. Symphony performs three key tasks: it (i) monitors the load on each application, (ii) collects past performance data and dynamically builds simple performance models of available processing resources, and (iii) computes a priority for pending requests based on the above parameters and the requests' slack. Based on this, it reorders client requests across different applications to achieve acceptable response times. We also define how client-server applications should interact with a scheduler such as Symphony, and develop an API to this end. We deploy Symphony as user-space middleware on a high-end heterogeneous cluster with dual quad-core Xeon CPUs and dual NVIDIA Fermi GPUs.
An evaluation using representative applications shows that in the presence of load spikes (i) Symphony incurs 2-20× fewer requests that do not meet response time constraints compared with other schedulers, and (ii) in order to achieve the same performance as Symphony, other schedulers need 2× more cluster nodes.
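The abstract does not give Symphony's priority formula or its API, so the following is only an illustrative sketch of slack-based request reordering, not the authors' implementation. It assumes that slack is the time left until a request's deadline minus an estimated service time, that the estimate is the mean of past observed service times per application, and that the request with the least slack is dispatched first. The names Request, SlackScheduler, record, submit and next_request are hypothetical, and load monitoring and CPU/GPU placement are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Request:
    app: str         # application the request belongs to
    deadline: float  # absolute response deadline, in seconds


class SlackScheduler:
    """Hypothetical slack-ordered scheduler (assumption, not Symphony's algorithm)."""

    def __init__(self, default_estimate: float = 0.0) -> None:
        self.default_estimate = default_estimate
        self.history: Dict[str, List[float]] = {}  # app -> observed service times
        self.pending: List[Request] = []

    def record(self, app: str, service_time: float) -> None:
        # Collect past performance data for a simple per-application model.
        self.history.setdefault(app, []).append(service_time)

    def estimate(self, app: str) -> float:
        # Simple performance model: mean of the observed service times.
        samples = self.history.get(app, [])
        return sum(samples) / len(samples) if samples else self.default_estimate

    def submit(self, request: Request) -> None:
        self.pending.append(request)

    def slack(self, request: Request, now: float) -> float:
        # Assumed definition: time to deadline minus estimated service time.
        return request.deadline - now - self.estimate(request.app)

    def next_request(self, now: float) -> Optional[Request]:
        # Reorder across applications: dispatch the least-slack request first.
        if not self.pending:
            return None
        chosen = min(self.pending, key=lambda r: self.slack(r, now))
        self.pending.remove(chosen)
        return chosen


sched = SlackScheduler()
sched.record("search", 0.2)
sched.record("search", 0.4)
sched.record("ocr", 1.0)
sched.submit(Request(app="search", deadline=2.0))
sched.submit(Request(app="ocr", deadline=2.5))
order = [sched.next_request(now=1.0).app for _ in range(2)]
```

In this usage example the "ocr" request has the later deadline but is dispatched first, because its longer estimated service time leaves it less slack than the "search" request. That is the kind of cross-application reordering the abstract describes, under the assumptions stated above.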
Keywords :
computer graphic equipment; coprocessors; middleware; processor scheduling; resource allocation; API; GPU-based heterogeneous cluster; Symphony; application load awareness; client request rate; client-server application scheduler; coprocessor-based heterogeneous clusters; dual NVIDIA Fermi GPU; dual quad-core Xeon CPU; dynamic cluster sharing; load spike; multiple concurrently-executing client-server applications; multitenancy; performance data; performance model; processing resource; request deadline; request priority; request slack; user-space middleware; Graphics processing unit; History; Kernel; Measurement; Processor scheduling; Scheduling; Time factors;
Conference_Title :
Cluster Computing (CLUSTER), 2011 IEEE International Conference on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4577-1355-2
Electronic_ISBN :
978-0-7695-4516-5
DOI :
10.1109/CLUSTER.2011.46