• DocumentCode
    228704
  • Title

    Enabling Efficient Multithreaded MPI Communication through a Library-Based Implementation of MPI Endpoints

  • Author

    Sridharan, Sridha ; Dinan, James ; Kalamkar, Dhiraj D.

  • fYear
    2014
  • fDate
    16-21 Nov. 2014
  • Firstpage
    487
  • Lastpage
    498
  • Abstract
    Modern high-speed interconnection networks are designed with capabilities to support communication from multiple processor cores. The MPI endpoints extension has been proposed to ease process and thread count tradeoffs by enabling multithreaded MPI applications to efficiently drive independent network communication. In this work, we present the first implementation of the MPI endpoints interface and demonstrate the first applications running on this new interface. We use a novel library-based design that can be layered on top of any existing, production MPI implementation. Our approach uses proxy processes to isolate threads in an MPI job, eliminating threading overheads within the MPI library and allowing threads to achieve process-like communication performance. We evaluate the performance advantages of our implementation through several benchmarks and kernels. Performance results for the Lattice QCD Dslash kernel indicate that endpoints provide up to a 2.9× improvement in communication performance and a 1.87× overall performance improvement over a highly optimized hybrid MPI+OpenMP baseline on 128 processors.
  • Keywords
    application program interfaces; message passing; multi-threading; multiprocessing systems; multiprocessor interconnection networks; software libraries; MPI endpoints extension; MPI endpoints interface; MPI job; MPI library; MPI+OpenMP baseline; high-speed interconnection network; independent network communication; lattice QCD Dslash kernel; library-based design; library-based implementation; multiple processor core; multithreaded MPI application; multithreaded MPI communication; performance evaluation; process-like communication performance; production MPI implementation; thread count tradeoff; threading overhead; Arrays; Context; Kernel; Libraries; Message systems; Parallel programming; Semantics; Endpoints; Hybrid Parallel Programming; MPI
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for
  • Conference_Location
    New Orleans, LA
  • Print_ISBN
    978-1-4799-5499-5
  • Type

    conf

  • DOI
    10.1109/SC.2014.45
  • Filename
    7013027