Title :
Transparent GPU Execution of NumPy Applications
Author :
Blum, Troels ; Kristensen, Mads R. B. ; Vinter, Brian
Author_Institution :
Niels Bohr Inst., Univ. of Copenhagen, Copenhagen, Denmark
Abstract :
In this work, we present a back-end for the Python library NumPy that utilizes the GPU seamlessly. We use dynamic code generation to generate kernels, and data is moved transparently to and from the GPU. For the integration into NumPy, we use the Bohrium runtime system. Bohrium hooks into NumPy through the implicit data parallelization of array operations; this approach requires no annotations or other code modifications. The key motivation for our GPU computation back-end is to transform high-level Python/NumPy applications into low-level GPU executable kernels, with the goal of obtaining high performance, high productivity, and high portability (HP3). We provide a performance study of the GPU back-end that includes four well-known benchmark applications, Black-Scholes, Successive Over-relaxation, Shallow Water, and N-body, implemented in pure Python/NumPy. We demonstrate a speedup of 834 times for the Black-Scholes application, and an average speedup of 124 times across the four benchmarks.
Keywords :
graphics processing units; program compilers; software libraries; Black-Scholes applications; Bohrium runtime system; GPU computation back-end; N-body applications; Python library NumPy applications; array operation data parallelization; dynamic code generation; kernel generation; low-level GPU executable kernels; shallow water applications; successive overrelaxation applications; transparent GPU execution; Arrays; Bridges; Engines; Graphics processing units; Kernel; Libraries; Vectors; Code Generation; Computational Science; GPU; JIT; Python/NumPy;
Conference_Titel :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.114