• DocumentCode
    3697205
  • Title
    CUDA Grid-Level Task Progression Algorithms
  • Author
    Christos Kartsaklis;Wayne Joubert;Oscar R. Hernandez;Markus Eisenbach;Wael R. Elwasif;David E. Bernholdt
  • Author_Institution
    Oak Ridge Nat. Lab., Oak Ridge, TN, USA
  • fYear
    2015
  • Firstpage
    1628
  • Lastpage
    1632
  • Abstract
    Tasking is a prominent parallel programming model. In this paper we conduct a first study into the feasibility of task-parallel execution at the CUDA grid level, rather than the stream/kernel level, for regular, fixed in-out dependency task graphs, similar to those found in wavefront computational patterns, making the findings broadly applicable. We propose and evaluate three CUDA task progression algorithms, in which threadblocks cooperatively process the task graph, and discuss their performance in terms of tasking throughput, atomics and memory I/O overheads. Our initial results demonstrate a throughput of 38 million tasks/second on a Kepler K20X architecture.
  • Keywords
    "Graphics processing units","Instruction sets","Computational modeling","Parallel processing","Runtime","Radiation detectors","Kernel"
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conference on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
  • Type
    conf
  • DOI
    10.1109/HPCC-CSS-ICESS.2015.53
  • Filename
    7336402