• DocumentCode
    1920809
  • Title
    CuNesl: Compiling Nested Data-Parallel Languages for SIMT Architectures
  • Author
    Zhang, Yongpeng ; Mueller, Frank
  • Author_Institution
    North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2012
  • fDate
    10-13 Sept. 2012
  • Firstpage
    340
  • Lastpage
    349
  • Abstract
    Data-parallel languages feature fine-grained parallel primitives that can be supported by compilers targeting modern many-core architectures where data parallelism must be exploited to fully utilize the hardware. Previous research has focused on converting data-parallel languages for SIMD (single instruction multiple data) architectures. However, directly applying them to today's SIMT (single instruction multiple thread) architectures does not guarantee competitive performance. We propose cuNesl, a compiler framework to translate and optimize NESL into parallel CUDA programs for SIMT architectures. By converting recursive calls into while loops, we ensure that the hierarchical execution model in GPUs can be exploited on the "flattened" code. The performance gap between our auto-generated CUDA code and hand-crafted CUDA code thus narrows while programmability is greatly increased. Our compiler outperforms handwritten parallel code running on CPUs in terms of both execution time and programmability.
  • Keywords
    multiprocessing systems; parallel architectures; parallel languages; parallel programming; program compilers; CuNesl; GPU; SIMT architectures; auto-generated CUDA code; compiler framework; data parallelism; execution time; fine-grained parallel primitives; hand-crafted CUDA code; handwritten parallel code; hierarchical execution model; modern many-core architectures; nested data-parallel languages; parallel CUDA programs; programmability; single instruction multiple data architectures; Arrays; Graphics processing unit; Hardware; Instruction sets; Kernel; Parallel processing; Synchronization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2012 41st International Conference on
  • Conference_Location
    Pittsburgh, PA
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4673-2508-0
  • Type
    conf
  • DOI
    10.1109/ICPP.2012.21
  • Filename
    6337595