• DocumentCode
    692858
  • Title

    General transformations for GPU execution of tree traversals

  • Author

    Goldfarb, Michael ; Jo, Youngjoon ; Kulkarni, Milind

  • Author_Institution
    Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
  • fYear
    2013
  • fDate
    17-22 Nov. 2013
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    With the advent of programmer-friendly GPU computing environments, there has been much interest in offloading workloads that can exploit the high degree of parallelism available on modern GPUs. Exploiting this parallelism and optimizing for the GPU memory hierarchy are well understood for regular applications that operate on dense data structures such as arrays and matrices. However, there has been significantly less work in the area of irregular algorithms, and even less when pointer-based dynamic data structures are involved. Recently, irregular algorithms such as Barnes-Hut and kd-tree traversals have been implemented on GPUs, yielding significant performance gains over CPU implementations. However, these implementations often rely on exploiting application-specific semantics to get acceptable performance. We argue that there are general-purpose techniques for implementing irregular algorithms on GPUs that exploit similarities in algorithmic structure rather than application-specific knowledge. We demonstrate these techniques on several tree traversal algorithms, achieving speedups of up to 38× over 32-thread CPU versions.
  • Keywords
    multiprocessing systems; parallel algorithms; tree data structures; Barnes-Hut tree traversal algorithm; CPU versions; GPU execution; GPU memory hierarchy; algorithmic structure; application-specific semantics; general-purpose techniques; irregular algorithms; kd-tree traversal algorithm; offloading workloads; pointer-based dynamic data structures; programmer-friendly GPU computing environments; Abstracts; Educational institutions; Legged locomotion; Optimization; Propulsion; Semantics; Servers; GPU; irregular programs; tree traversals; vectorization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis (SC), 2013 International Conference for
  • Conference_Location
    Denver, CO
  • Print_ISBN
    978-1-4503-2378-9
  • Type

    conf

  • DOI
    10.1145/2503210.2503223
  • Filename
    6877443