DocumentCode :
2037399
Title :
Design and implementation of a parallel priority queue on many-core architectures
Author :
Xi He ; Agarwal, Deborah ; Prasad, Sushil K.
Author_Institution :
Dept. of Comput. Sci., Georgia State Univ., Atlanta, GA, USA
fYear :
2012
fDate :
18-22 Dec. 2012
Firstpage :
1
Lastpage :
10
Abstract :
An efficient parallel priority queue is at the core of the effort in parallelizing important non-numeric irregular computations such as discrete event simulation scheduling and branch-and-bound algorithms. GPGPUs can provide powerful computing platform for such non-numeric computations if an efficient parallel priority queue implementation is available. In this paper, aiming at fine-grained applications, we develop an efficient parallel heap system employing CUDA. To our knowledge, this is the first parallel priority queue implementation on many-core architectures, thus represents a breakthrough. By allowing wide heap nodes to enable thousands of simultaneous deletions of highest priority items and insertions of new items, and taking full advantage of CUDA´s data parallel SIMT architecture, we demonstrate up to 30-fold absolute speedup for relatively fine-grained compute loads compared to optimized sequential priority queue implementation on fast multicores. Compared to this, our optimized multicore parallelization of parallel heap yields only 2-3 fold speedup for such fine-grained loads. This parallelization of a tree-based data structure on GPGPUs provides a roadmap for future parallelizations of other such data structures.
Keywords :
data structures; graphics processing units; multiprocessing systems; parallel architectures; queueing theory; CUDA; GPGPU; branch-and-bound algorithms; data parallel SIMT architecture; data structures; discrete event simulation scheduling; fast multicores; fine-grained applications; fine-grained compute loads; fine-grained loads; many-core architectures; nonnumeric computations; nonnumeric irregular computations; optimized multicore parallelization; optimized sequential priority queue implementation; parallel heap system; parallel priority queue design; parallel priority queue implementation; tree-based data structure; wide heap nodes;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
High Performance Computing (HiPC), 2012 19th International Conference on
Conference_Location :
Pune
Print_ISBN :
978-1-4673-2372-7
Electronic_ISBN :
978-1-4673-2370-3
Type :
conf
DOI :
10.1109/HiPC.2012.6507490
Filename :
6507490
Link To Document :
بازگشت