• DocumentCode
    2141630
  • Title
    Auto-Tuning of Data Communication on Heterogeneous Systems
  • Author
    Jorda, Marc; Tanasic, Ivan; Cabezas, Javier; Vilanova, Lluis; Gelado, Isaac; Navarro, Nacho
  • Author_Institution
    Barcelona Supercomputing Center, Barcelona, Spain
  • fYear
    2013
  • fDate
    26-28 Sept. 2013
  • Firstpage
    135
  • Lastpage
    140
  • Abstract
    Heterogeneous systems formed by traditional CPUs and compute accelerators, such as GPUs, are becoming widely used to build modern supercomputers. However, many different system topologies (i.e., how CPUs, accelerators, and I/O devices are interconnected) are being deployed. Each system organization presents different trade-offs when transferring data between CPUs, accelerators, and nodes within a cluster, requiring different software implementations to achieve optimal data communication bandwidth. In this paper, we explore the potential impact of two optimizations to achieve optimal data transfer bandwidth: topology-aware process placement policies and double-buffering. We design a set of experiments to evaluate all possible alternatives, and run each of them on different hardware configurations. We show that optimal data transfer mechanisms depend on both the hardware topology and the application dataset size. Our experimental evaluation shows that auto-tuning applications to match the hardware topology and to find the best double-buffering configuration can improve data transfer bandwidth by up to 70% for local communication, and is key to achieving optimal bandwidth in remote communication for data transfers larger than 128 KB.
  • Keywords
    data communication; optimisation; parallel processing; topology; HPC environments; auto-tuning applications; double-buffering configuration; hardware topology; heterogeneous systems; optimal data communication bandwidth; optimal data transfer bandwidth; optimizations; topology-aware process placement policies; Bandwidth; Data transfer; Graphics processing units; Hardware; Peer-to-peer computing; Topology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Embedded Multicore SoCs (MCSoC), 2013 IEEE 7th International Symposium on
  • Conference_Location
    Tokyo
  • Type
    conf
  • DOI
    10.1109/MCSoC.2013.40
  • Filename
    6657919