Title :
A Comprehensive Performance Comparison of CUDA and OpenCL
Author :
Fang, Jianbin ; Varbanescu, Ana Lucia ; Sips, Henk
Author_Institution :
Parallel & Distrib. Syst. Group, Delft Univ. of Technol., Delft, Netherlands
Abstract :
This paper presents a comprehensive performance comparison between CUDA and OpenCL. We have selected 16 benchmarks ranging from synthetic applications to real-world ones. We make an extensive analysis of the performance gaps taking into account programming models, optimization strategies, architectural details, and underlying compilers. Our results show that, for most applications, CUDA performs at most 30% better than OpenCL. We also show that this difference is due to unfair comparisons: in fact, OpenCL can achieve similar performance to CUDA under a fair comparison. Therefore, we define a fair comparison of the two types of applications, providing guidelines for further analyses. We also investigate OpenCL's portability by running the benchmarks on other prevailing platforms with minor modifications. Overall, we conclude that OpenCL's portability does not fundamentally affect its performance, and OpenCL can be a good alternative to CUDA.
Keywords :
benchmark testing; computer graphic equipment; coprocessors; multiprocessing systems; parallel architectures; parallel programming; parallelising compilers; CUDA; NVIDIA GPU; OpenCL portability; architectural detail; compilers; optimization strategy; performance comparison; performance gap; programming model; Computational modeling; Graphics processing unit; Kernel; Performance evaluation; Programming; OpenCL
Conference_Title :
Parallel Processing (ICPP), 2011 International Conference on
Conference_Location :
Taipei City
Print_ISBN :
978-1-4577-1336-1
Electronic_ISBN :
0190-3918
DOI :
10.1109/ICPP.2011.45