Title :
Evaluation of Successive CPUs/APUs/GPUs Based on an OpenCL Finite Difference Stencil
Author :
Calandra, H. ; Dolbeau, R. ; Fortin, Pascal ; Lamotte, Jean-Luc ; Said, I.
Author_Institution :
Total, Pau, France
Date :
Feb. 27, 2013 - March 1, 2013
Abstract :
The AMD APU (Accelerated Processing Unit) architecture, which combines CPU and GPU cores on the same die, is promising for GPU applications whose performance is bottlenecked by the low PCI Express communication rate. However, the first APU generations still have separate CPU and GPU memory partitions. Currently, the APU integrated GPUs are also less powerful than discrete GPUs. In this paper we therefore investigate the relevance of APUs for scientific computing by evaluating and comparing the performance of two successive AMD APUs (family codenames Llano and Trinity), two successive discrete GPUs (chip codenames Cayman and Tahiti) and one hexa-core AMD CPU. For this purpose, we rely on a 3D finite difference stencil that is optimized and tuned in OpenCL. We detail the most interesting optimizations for each architecture and show very good performance in OpenCL: up to 500 Gflops on Tahiti. Finally, our results show that APU integrated GPUs outperform CPUs, and that the integrated GPUs of upcoming APUs may match discrete GPUs for problems with high communication requirements.
Keywords :
computer architecture; finite difference methods; graphics processing units; performance evaluation; peripheral interfaces; 3D finite difference stencil; AMD APU architecture; APU evaluation; CPU cores; CPU evaluation; CPU memory partitions; Cayman; GPU cores; GPU evaluation; GPU memory partitions; Llano; OpenCL finite difference stencil; PCI express communication rate; Tahiti; Trinity; accelerated processing unit; graphic processing units; scientific computing; Bandwidth; Central Processing Unit; Computer architecture; Graphics processing units; Kernel; Performance evaluation; Three-dimensional displays; APU; GPU; PCI Express bus; finite difference stencil; high performance scientific computing;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2013 21st Euromicro International Conference on
Conference_Location :
Belfast
Print_ISBN :
978-1-4673-5321-2
Electronic_ISBN :
1066-6192
DOI :
10.1109/PDP.2013.65