Title :
MSSM: An Efficient Scheduling Mechanism for CUDA Basing on Task Partition
Author :
Cheng Luo ; Suda, Ryutaro
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Univ. of Tokyo, Tokyo, Japan
Abstract :
This paper presents a multiple stream scheduling mechanism that enables kernel execution, host-to-device data transfers, and device-to-host data transfers to run in parallel using multiple CUDA streams. The mechanism divides kernels and bidirectional data transfers into small subtasks so that they can be overlapped easily and efficiently on a CUDA-compatible graphics processing unit (GPU). To set the optimal subtask size, we built a compute-bound model for compute-intensive applications and a data-bound model for applications dominated by bidirectional data transfer. Based on the two models, we also provide three scheduling algorithms for data-dependent and data-independent applications to maximize the efficiency of the overlap. We applied the mechanism to a set of benchmarks to evaluate its performance. The results show that our work successfully hides latency and achieves performance very close to the optimum.
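The trade-off behind choosing a subtask size can be illustrated with a small timing sketch. This is not the paper's compute-bound or data-bound model. It is an idealized three-stage pipeline (host-to-device copy, kernel, device-to-host copy) that assumes equal-sized subtasks, a dedicated resource per stage, and a fixed per-subtask launch cost. The function names and the `overhead` parameter are hypothetical and chosen only for illustration.

```python
def pipelined_time(h2d, kernel, d2h, n, overhead=0.0):
    """Idealized makespan when the work is split into n equal subtasks.

    h2d, kernel, d2h: total time of each stage for the whole task.
    overhead: assumed fixed cost added to every subtask in every stage
              (hypothetical stand-in for launch/setup latency).
    """
    stages = [h2d / n + overhead, kernel / n + overhead, d2h / n + overhead]
    # The first subtask passes through all three stages; each later
    # subtask finishes one bottleneck-stage interval after the previous.
    return sum(stages) + (n - 1) * max(stages)


def best_subtask_count(h2d, kernel, d2h, overhead, max_n=64):
    """Brute-force search for the subtask count with the smallest makespan."""
    return min(
        range(1, max_n + 1),
        key=lambda n: pipelined_time(h2d, kernel, d2h, n, overhead),
    )


if __name__ == "__main__":
    serial = pipelined_time(10.0, 10.0, 10.0, n=1)
    n_best = best_subtask_count(10.0, 10.0, 10.0, overhead=0.2)
    overlapped = pipelined_time(10.0, 10.0, 10.0, n_best, overhead=0.2)
    print(f"serial: {serial}, best n: {n_best}, overlapped: {overlapped:.2f}")
```

With zero overhead, finer partitioning always helps in this sketch. Once a per-subtask cost is added, the makespan has a minimum at a finite subtask count, which is the kind of optimum a subtask-size model has to locate.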
Keywords :
graphics processing units; parallel architectures; processor scheduling; CUDA; GPU; MSSM; bidirectional data transmission; efficient scheduling mechanism; graphic processing unit; kernel data transmission; multiple stream scheduling mechanism; task partition; Bidirectional control; Data transfer; Dynamic scheduling; Equations; Kernel; Mathematical model; Stream scheduling; parallelism; subtask overlap
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2012 IEEE 18th International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4673-4565-1
ISSN :
1521-9097
DOI :
10.1109/ICPADS.2012.80