DocumentCode :
3697205
Title :
CUDA Grid-Level Task Progression Algorithms
Author :
Christos Kartsaklis;Wayne Joubert;Oscar R. Hernandez;Markus Eisenbach;Wael R. Elwasif;David E. Bernholdt
Author_Institution :
Oak Ridge Nat. Lab., Oak Ridge, TN, USA
Year :
2015
Firstpage :
1628
Lastpage :
1632
Abstract :
Tasking is a prominent parallel programming model. In this paper we conduct a first study into the feasibility of task-parallel execution at the CUDA grid level, rather than the stream/kernel level, for regular task graphs with fixed in-out dependencies. Such graphs are similar to those found in wavefront computational patterns, which makes the findings broadly applicable. We propose and evaluate three CUDA task progression algorithms, in which threadblocks cooperatively process the task graph, and discuss their performance in terms of tasking throughput, atomics, and memory I/O overheads. Our initial results demonstrate a throughput of 38 million tasks/second on a Kepler K20X GPU.
Keywords :
"Graphics processing units","Instruction sets","Computational modeling","Parallel processing","Runtime","Radiation detectors","Kernel"
Publisher :
IEEE
Conference_Title :
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
Type :
conf
DOI :
10.1109/HPCC-CSS-ICESS.2015.53
Filename :
7336402