Title of article
Performance evaluation of hybrid programming patterns for large CPU/GPU heterogeneous clusters (Original Research Article)
Author/Authors
Fengshun Lu (author), Junqiang Song (author), Fukang Yin (author), Xiaoqian Zhu (author)
Issue Information
Monthly, serial issue, year 2012
Pages
10
From page
1172
To page
1181
Abstract
CPU/GPU heterogeneous clusters are important platforms for high-performance computing applications. However, running legacy scientific and engineering code efficiently on these heterogeneous systems poses many challenges. In this paper, we address the programming-model issue by combining the existing models (i.e., MPI, OpenMP and CUDA). First, two hybrid programming patterns are presented [pattern names rendered as images in the source record]. Second, three kernels (i.e., EP, CG and MG) of the NAS parallel benchmarks (NPBs), which are abstracted from many legacy computational fluid dynamics applications, are implemented with these two patterns. Third, the hybrid implementations are executed on the TianHe-1A supercomputer, and the experimental results show that significant performance improvement can be achieved with these patterns. Finally, a detailed performance analysis of the two hybrid patterns is performed, and some guidelines are given for porting legacy code onto large-scale heterogeneous CPU/GPU clusters.
Keywords
MPI , CUDA , OpenMP , GPU cluster , Performance evaluation , NPB
Journal title
Computer Physics Communications
Serial Year
2012
Record number
1138576