Regional Information Center for Science and Technology - GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing

DocumentCode :

1917823

Title :

GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing

Author :

Odajima, Tetsuya ; Boku, Taisuke ; Hanawa, Toshihiro ; Lee, Jinpil ; Sato, Mitsuhisa

Author_Institution :

Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan

fYear :

2012

fDate :

10-13 Sept. 2012

Firstpage :

Lastpage :

106

Abstract :

In this paper, we propose a solution framework to enable the work sharing of parallel processing by the coordination of CPUs and GPUs on hybrid PC clusters based on the high-level parallel language XcalableMP-dev. Basic XcalableMP enables high-level parallel programming using sequential code directives that support data distribution and loop/task distribution among multiple nodes on a PC cluster. XcalableMP-dev is an extension of XcalableMP for a hybrid PC cluster, where each node is equipped with accelerated computing devices such as GPUs, many-core environments, etc. Our new framework proposed here, named XcalableMP-dev/StarPU, enables the distribution of data and loop execution among multiple GPUs and multiple CPU cores on each node. We employ a StarPU run-time system for task management with dynamic load balancing. Because of the large performance gap between CPUs and GPUs, the key issue for work sharing among CPU and GPU resources is the task size control assigned to different devices. Since the compiler of the new system is still under construction, we evaluated the performance of hybrid work sharing among four nodes of a GPU cluster and confirmed that the performance gain by the traditional XcalableMP-dev system on NVIDIA CUDA is up to 1.4 times faster than GPU-only execution.

Keywords :

graphics processing units; multiprocessing systems; parallel architectures; parallel programming; performance evaluation; resource allocation; CPU; GPU; NVIDIA CUDA; StarPU run-time system; XcalableMP-dev high-level parallel language; data distribution; dynamic load balancing; high-level parallel programming; hybrid PC clusters; loop distribution; loop execution; many-core environments; parallel processing; parallelized accelerated computing; performance evaluation; sequential code directives; task distribution; task management; task size control; Acceleration; Arrays; Distributed databases; Graphics processing unit; Multicore processing; Performance evaluation; Synchronization;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel Processing Workshops (ICPPW), 2012 41st International Conference on

Conference_Location :

Pittsburgh, PA

ISSN :

1530-2016

Print_ISBN :

978-1-4673-2509-7

Type :

conf

DOI :

10.1109/ICPPW.2012.16

Filename :

6337468

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1917823