DocumentCode
692510
Title
Algorithmic GPGPU memory optimization
Author
Byunghyun Jang ; Minsu Choi ; Kyung Ki Kim
Author_Institution
Comput. & Inf. Sci., Univ. of Mississippi, Oxford, MS, USA
fYear
2013
fDate
17-19 Nov. 2013
Firstpage
154
Lastpage
157
Abstract
The performance of General-Purpose computation on Graphics Processing Units (GPGPU) depends heavily on memory access behavior. In this paper, we present an algorithmic methodology to semi-automatically find the best mapping of the memory accesses present in serial loop nests onto underlying data-parallel architectures, based on a comprehensive static memory access pattern analysis. To that end, we present a simple yet powerful mathematical model that captures all memory access pattern information present in serial data-parallel loop nests. We then show how this model is used in practice to select the most appropriate memory space for data and to search a large design space for an appropriate thread mapping and work group size. Our experimental results are reported using the industry-standard heterogeneous programming language, OpenCL, targeting the NVIDIA GT200 architecture. The full version of the paper can be found at [1].
Keywords
graphics processing units; parallel architectures; programming languages; NVIDIA GT200 architecture; OpenCL; algorithmic GPGPU memory optimization; algorithmic methodology; comprehensive static memory access pattern analysis; data-parallel architectures; general-purpose computation on graphics processing units; heterogeneous programming language; memory access behavior; memory access pattern information; memory accesses mapping; memory space; serial data-parallel loop nests; serial loop nest; thread mapping; Analytical models; Graphics processing units; Hardware; Instruction sets; Kernel; Programming; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
SoC Design Conference (ISOCC), 2013 International
Conference_Location
Busan
Type
conf
DOI
10.1109/ISOCC.2013.6863959
Filename
6863959