Title :
Computation and Communication Aware task graph Scheduling on multi-GPU systems
Author :
Wang, Yun-Ting ; Lee, Jia-Ying ; Lai, Bo-Cheng Charles
Author_Institution :
Dept. of Taiwan Semiconductor Manufacturing Co., Hsinchu, Taiwan
Abstract :
GPUs have emerged as popular throughput computing platforms thanks to their massively parallel computing capability and low cost. Interest is growing in systems with multiple GPUs as a way to push performance beyond what a single GPU offers. Attaining superior performance in a multi-GPU system involves three main design challenges: load balance, memory utilization, and data transfer. An imbalanced load across the system can leave GPUs idle, while poor data reuse triggers excessive memory accesses. Inefficient data transfer between the host and a device becomes a considerable performance overhead during high-throughput computing. This paper addresses these design issues by proposing Computation and Communication Aware task graph Scheduling (CCAS) for multi-GPU systems. CCAS adopts an effective heuristic algorithm that considers both data reuse and load balance in a multi-GPU system. The data transfer overhead is hidden by extensively overlapping computation with data communication. Experimental results show that CCAS achieves an average performance improvement of 22.15% over a previous work.
Keywords :
Data transfer; Graphics processing units; Kernel; Performance evaluation; Processor scheduling; Scheduling; Throughput; GPUs; Task Graph
Conference_Title :
Digital Signal Processing (DSP), 2015 IEEE International Conference on
Conference_Location :
Singapore, Singapore
DOI :
10.1109/ICDSP.2015.7251841