Title :
PADS: A Pattern-Driven Stencil Compiler-Based Tool for Reuse of Optimizations on GPGPUs
Author :
Han, Dongni ; Xu, Shixiong ; Chen, Li ; Huang, Lei
Author_Institution :
Key Lab. of Comput. Syst. & Archit., Inst. of Comput. Technol., Beijing, China
Abstract :
Stencil computations are at the core of a wide range of scientific and engineering applications. Much effort has gone into improving the efficiency of stencil calculations on different platforms, but the resulting optimizations are unfortunately not easy to reuse. In this paper we present PADS, a PAttern-Driven Stencil compiler-based tool, together with a simple tuning system, to reuse those well-optimized methods and codes. We also suggest extensions to OpenMP that describe high-level data structures in order to facilitate recognition of various stencil computation patterns. PADS allows programmers to rewrite stencil kernels or reuse source-to-source translator outputs as optimized stencil template codes with related tuning parameters. In addition, PADS includes an OpenMP-to-CUDA translator and a code generator that uses the optimized template codes. It also obtains architecture-specific parameters to tune stencils across different GPU platforms. To demonstrate the flexibility and performance portability of our system, we present four stencil computations: a Laplacian operator with the Jacobi iterative method, a divergence operator, a 3D 25-point stencil, and a 2D heat equation solved using the ADI method with periodic boundary conditions. PADS succeeds in generating all four stencil codes using different optimization strategies and delivers a promising performance improvement.
Keywords :
graphics processing units; iterative methods; multiprocessing systems; optimisation; program compilers; program interpreters; GPGPU; Jacobi iterative method; Laplacian operator; OpenMP; data structure; divergence operator; general-purpose graphics processing unit; heat equation; multiprocessing system; optimization reuse; optimization strategy; pattern-driven stencil compiler; periodic boundary condition; source-to-source translator; stencil computation; stencil template code; stencil tuning; Generators; Kernel; Libraries; Optimization; Pattern matching; Tuning
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2011 IEEE 17th International Conference on
Conference_Location :
Tainan
Print_ISBN :
978-1-4577-1875-5
DOI :
10.1109/ICPADS.2011.94