• DocumentCode
    3585397
  • Title

    XcalableACC: Extension of XcalableMP PGAS Language Using OpenACC for Accelerator Clusters

  • Author

    Nakao, Masahiro ; Murai, Hitoshi ; Shimosaka, Takenori ; Tabuchi, Akihiro ; Hanawa, Toshihiro ; Kodama, Yuetsu ; Boku, Taisuke ; Sato, Mitsuhisa

  • Author_Institution
    RIKEN Adv. Inst. for Comput. Sci., Kobe, Japan
  • fYear
    2014
  • Firstpage
    27
  • Lastpage
    36
  • Abstract
    The present paper introduces the XcalableACC (XACC) programming model, which is a hybrid model of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language and OpenACC. XACC defines directives that enable programmers to mix XMP and OpenACC directives in order to develop applications that can use accelerator clusters with ease. Moreover, in order to improve the performance of stencil applications, the Omni XACC compiler provides functions that can transfer a halo region on accelerator memory via Tightly Coupled Accelerators (TCA), which is a proprietary network for transferring data directly among accelerators. In the present paper, we evaluate the productivity and the performance of XACC through implementations of the HIMENO Benchmark. The results show that, thanks to the productivity improvements, XACC requires fewer than half the source lines of code compared to a combination of Message Passing Interface (MPI) and OpenACC, which are commonly used together as a typical programming model. As a result of the performance improvements, XACC using TCA achieved up to 2.7 times faster performance than could be obtained via the combined OpenACC and MPI programming model using GPUDirect RDMA over InfiniBand.
  • Keywords
    distributed memory systems; message passing; parallel programming; source code (software); GPUDirect RDMA; HIMENO benchmark; InfiniBand; MPI programming model; Omni XACC compiler; OpenACC; TCA; XACC programming model; XMP PGAS language; XMP Partitioned Global Address Space language; XcalableACC; XcalableMP PGAS language; accelerator clusters; accelerator memory; halo region; message passing interface; source code; stencil applications; tightly coupled accelerators; Arrays; Computational modeling; Graphics processing units; Indexes; Programming; Synchronization; Syntactics; Design language; Development Compiler; Accelerator Cluster; Partitioned Global Address Space Language;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Accelerator Programming using Directives (WACCPD), 2014 First Workshop on
  • Type

    conf

  • DOI
    10.1109/WACCPD.2014.6
  • Filename
    7081675