DocumentCode :
3199134
Title :
A Scheduling and Runtime Framework for a Cluster of Heterogeneous Machines with Multiple Accelerators
Author :
Beri, Tarun ; Bansal, Sorav ; Kumar, Subodh
Author_Institution :
Indian Inst. of Technol. Delhi, New Delhi, India
fYear :
2015
fDate :
25-29 May 2015
Firstpage :
146
Lastpage :
155
Abstract :
We present a runtime system for simple and efficient programming of CPU+GPU clusters. The programmer focuses on core logic, while the system undertakes task allocation, load balancing, scheduling, data transfer, etc. Our programming model is based on a shared global address space, made efficient by transaction-style bulk-synchronous semantics. This model broadly targets coarse-grained data-parallel computation particularly suited to multi-GPU heterogeneous clusters. We describe our computation and communication scheduling system and report its performance on a few prototype applications. For example, parallelization of matrix multiplication or 2D FFT using our system requires the regular CPU/GPU implementations and about 30 lines of additional C code to set up the runtime. Our runtime system achieves a performance of 5.61 TFlop/s while multiplying two square matrices of 1.56 billion elements each over a 10-node cluster with 20 GPUs. This performance is possible due to a number of critical optimizations working in concert. These include prefetching, pipelining, maximizing overlap between computation and communication, and scheduling efficiently across heterogeneous devices of vastly different capacities.
Keywords :
graphics processing units; parallel processing; resource allocation; scheduling; CPU+GPU cluster programming; accelerator; data transfer; graphics processing unit; heterogeneous machine; high-performance computing; load balancing; runtime framework; scheduling framework; task allocation; transaction style bulk-synchronous semantics; Data transfer; Graphics processing units; Kernel; Message systems; Programming; Runtime; Subscriptions; Heterogeneous Architectures; High Performance Computing; Hybrid CPU-GPU Clusters; Multi Scheduling; Work Stealing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Processing Symposium (IPDPS), 2015 IEEE International
Conference_Location :
Hyderabad
ISSN :
1530-2075
Type :
conf
DOI :
10.1109/IPDPS.2015.12
Filename :
7161504