DocumentCode :
1879340
Title :
Accelerating range-based loops on heterogeneous systems
Author :
Suwancharoen, Chaturapat ; Marurngsith, Worawan
Author_Institution :
Dept. of Comput. Sci., Thammasat Univ., Pathum Thani, Thailand
fYear :
2015
fDate :
28-31 Jan. 2015
Firstpage :
99
Lastpage :
104
Abstract :
The range-based loop is a powerful construct because of its clear and concise syntax. Abstracting away the loop index in a range-based loop implies loop-level parallelism that is ready to be exploited. Despite these advantages in hidden parallelism and programmability, the magnitude of the performance gain from accelerating range-based loops on heterogeneous systems is still not well studied. This paper addresses this issue and makes three contributions. First, it presents a review showing the magnitude of the performance gain from CUDA/OpenCL code generated by ten existing auto-parallelizing compilers. Second, it reports a performance comparison between the acceleration of range-based and traditional loops on four workloads from the SHOC benchmark. Third, it discusses the performance limitations of using a directive-based compiler to accelerate range-based loops. The results show that transforming scientific workloads to exploit range-based loops is a challenge. The review shows that code generated by auto-parallelizing compilers achieved an average speedup of 37±23x relative to sequential CPU execution, while the proposed range-based compiler achieved a higher-than-average speedup (44.8±22x). The evaluation against four workloads from the highly tuned benchmark shows that range-based loop acceleration achieved, on average, 72% of the benchmark's performance. This highlights range-based loops as a promising target for auto-parallelizing compilation on heterogeneous systems.
Keywords :
application program interfaces; parallel architectures; parallel programming; parallelising compilers; program control structures; software performance evaluation; CUDA/OpenCL code; SHOC benchmark; auto-parallelizing compilers; directive-based compiler; heterogeneous systems; range-based loop acceleration; Acceleration; Benchmark testing; Containers; Graphics processing units; Parallel processing; Performance evaluation; Performance gain; GPU; OpenCL; directive-based compiler; heterogeneous systems; loop parallelization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Knowledge and Smart Technology (KST), 2015 7th International Conference on
Conference_Location :
Chonburi
Print_ISBN :
978-1-4799-6048-4
Type :
conf
DOI :
10.1109/KST.2015.7051466
Filename :
7051466