DocumentCode :
167467
Title :
KernelGen -- The Design and Implementation of a Next Generation Compiler Platform for Accelerating Numerical Models on GPUs
Author :
Mikushin, Dmitry ; Likhogrud, Nikolay ; Zhang, Eddy Z. ; Bergstrom, Christopher
Author_Institution :
Inst. of Comput. Sci., Univ. of Lugano, Lugano, Switzerland
fYear :
2014
fDate :
19-23 May 2014
Firstpage :
1011
Lastpage :
1020
Abstract :
GPUs are becoming pervasive in scientific computing. Originally used as peripheral accelerators, they are gradually turning into central computing nodes. However, most current directive-based approaches to parallelizing sequential legacy code, such as OpenACC and HMPP, simply offload "hot" CPU code onto GPUs, which entails limitations such as unsupported external calls and coarse-grained data dependence analysis. This paper introduces KernelGen, a parallelization framework with a robust parallelism detection mechanism and a novel GPU-centric execution model. KernelGen supports the major scientific programming languages, including C and Fortran, and has multiple backends that can generate target code for both X86 CPUs and NVIDIA GPUs. The efficiency of KernelGen is demonstrated by performance improvements of up to 5.4× compared with three major commercial OpenACC compilers over a benchmark suite of numerical kernels.
Keywords :
graphics processing units; natural sciences computing; parallel processing; program compilers; C; CPU code; Fortran; GPU-centric execution model; HMPP; KernelGen; NVIDIA GPU; OpenACC; X86 CPU; central computing nodes; coarse-grained data dependence analysis; commercial OpenACC compilers; directive-based approaches; next generation compiler platform; numerical model acceleration; peripheral accelerators; robust parallelism detection mechanism; scientific programming languages; sequential legacy code parallelization; unsupported external calls; DSL; Graphics processing units; Kernel; Parallel processing; Pipelines; Programming; Runtime; GPU; JIT-compilation; LLVM; OpenACC; stencils;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
Type :
conf
DOI :
10.1109/IPDPSW.2014.115
Filename :
6969492