Title :
Memory Hierarchy Optimization for Large Tridiagonal System Solvers on GPU
Author :
Lamas-Rodríguez, Julián ; Argüello, Francisco ; Heras, Dora B. ; Bóo, Montserrat
Author_Institution :
Res. Center on Inf. Technol., Univ. of Santiago de Compostela, Santiago de Compostela, Spain
Abstract :
Nowadays, GPUs are commodity hardware containing hundreds of cores and supporting thousands of threads that can be used to accelerate a wide range of applications. From a programmer's perspective, GPUs offer a stream processing model which requires the application of new techniques to exploit their capabilities. In this paper we present the application of the split-and-merge technique to the following parallel tridiagonal system solvers on the GPU: cyclic reduction and recursive doubling. The split-and-merge technique naturally splits the algorithm flow into parallel paths that can be solved in shared memory and later merged in global memory. In this way, we can solve large systems of equations efficiently by exploiting the memory hierarchy of the GPU. The results obtained show a significant acceleration compared with the direct implementation of the algorithms on the GPU.
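
For readers unfamiliar with cyclic reduction, one of the two solvers named in the abstract, the following is a minimal CPU sketch in Python. It is not the authors' GPU implementation and does not include the split-and-merge technique; the function name `cyclic_reduction` and the array layout (`a` sub-diagonal, `b` diagonal, `c` super-diagonal, `d` right-hand side, with `a[0]` and `c[-1]` unused) are assumptions made for illustration only.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    Illustrative sketch only: assumes no zero pivots arise (for example, a
    diagonally dominant system). a[0] and c[-1] are treated as zero.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]

    # Forward reduction: eliminate the even-indexed unknowns so that only
    # the odd-indexed unknowns remain, coupled in a half-size system.
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1] if i - 1 > 0 else 0.0)
        rb.append(b[i] - alpha * c[i - 1]
                  - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if i + 2 < n else 0.0)
        rd.append(d[i] - alpha * d[i - 1]
                  - (gamma * d[i + 1] if has_next else 0.0))

    # Solve the reduced system for the odd-indexed unknowns.
    x = [0.0] * n
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)

    # Back substitution: recover each even-indexed unknown from its
    # already-known odd-indexed neighbours.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x
```

Within each reduction or substitution step the loop iterations are independent of one another, which is what makes the method a natural fit for one-thread-per-equation execution on a GPU.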
Keywords :
graphics processing units; optimisation; parallel processing; GPU; commodity hardware; cyclic reduction; large tridiagonal system solvers; memory hierarchy optimization; parallel tridiagonal system solvers; recursive doubling; split-and-merge technique; Arrays; Equations; Graphics processing unit; Instruction sets; Kernel; Mathematical model; CUDA; GPGPU; tridiagonal system solver
Conference_Title :
Parallel and Distributed Processing with Applications (ISPA), 2012 IEEE 10th International Symposium on
Conference_Location :
Leganes
Print_ISBN :
978-1-4673-1631-6
DOI :
10.1109/ISPA.2012.20