• DocumentCode
    1902625
  • Title

    HiDP: A hierarchical data parallel language

  • Author

    Zhang, Yongpeng ; Mueller, Frank

  • Author_Institution
    North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2013
  • fDate
    23-27 Feb. 2013
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    Problem domains are commonly decomposed hierarchically to fully utilize parallel resources in modern microprocessors. Such decompositions can be provided as library routines, written by experts, for general algorithmic patterns. But such APIs tend to be constrained to certain architectures or data sizes. Integrating them with application code is often an unnecessarily daunting task, especially when these routines need to be closely coupled with user code to achieve better performance. This paper contributes HiDP, a high-level hierarchical data parallel language. The purpose of HiDP is to improve the coding productivity of integrating hierarchical data parallelism without significant loss of performance. HiDP is a source-to-source compiler that converts a very concise data parallel language into CUDA C++ source code. Internally, it performs necessary analysis to compose user code with efficient and architecture-aware code snippets. This paper discusses various aspects of HiDP systematically: the language, the compiler and the run-time system with built-in tuning capabilities. They enable HiDP users to express algorithms in less code than low-level SDKs require for native platforms. HiDP also exposes abundant computing resources of modern parallel architectures. Improved coding productivity tends to come with a sacrifice in performance. Yet, experimental results show that the generated code delivers performance very close to handcrafted native GPU code.
  • Keywords
    C++ language; application program interfaces; graphics processing units; microprocessor chips; parallel architectures; parallel languages; program compilers; source coding; APIs; CUDA C++ source code; HiDP; algorithmic patterns; architecture-aware code snippets; built-in tuning capabilities; coding productivity; computing resources; data parallel language; data sizes; handcrafted native GPU code; hierarchical data parallelism; high-level hierarchical data parallel language; library routines; low-level SDKs; microprocessors; parallel architectures; parallel resources; run-time system; source-to-source compiler; user code; Arrays; Graphics processing units; Kernel; Libraries; Parallel processing; Shape; Synchronization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Code Generation and Optimization (CGO), 2013 IEEE/ACM International Symposium on
  • Conference_Location
    Shenzhen
  • Print_ISBN
    978-1-4673-5524-7
  • Type

    conf

  • DOI
    10.1109/CGO.2013.6494994
  • Filename
    6494994