  • DocumentCode
    3429339
  • Title
    Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes

  • Author
    Subramoni, Hari ; Potluri, Sreeram ; Kandalla, Krishna ; Barth, B. ; Vienne, Jerome ; Keasler, Jeff ; Tomko, Karen ; Schulz, K. ; Moody, Adam ; Panda, Dhabaleswar K.

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
  • fYear
    2012
  • fDate
    10-16 Nov. 2012
  • Firstpage
    1
  • Lastpage
    12
  • Abstract
    Over the last decade, InfiniBand has become an increasingly popular interconnect for deploying modern supercomputing systems. However, no existing detection service can discover the underlying network topology in a scalable manner and expose this information conveniently to runtime libraries and users of high-performance computing systems. In this paper, we design a novel and scalable method to detect the InfiniBand network topology using Neighbor-Joining (NJ) techniques. To the best of our knowledge, this is the first time the neighbor-joining algorithm has been applied to the problem of detecting InfiniBand network topology. We also design a network-topology-aware MPI library that takes advantage of the network topology service. The library places the processes of an MPI job in a network-topology-aware manner, with the dual aim of increasing intra-node communication and reducing long-distance inter-node communication across the InfiniBand fabric.
  • Keywords
    application program interfaces; message passing; network topology; parallel processing; InfiniBand fabric; InfiniBand network topology; InfiniBand topology service; NJ; high performance computing systems; inter-node communication; intra-node communication; neighbor-joining techniques; network-topology-aware MPI library; process network-topology-aware placement; runtime libraries; supercomputing systems; Algorithm design and analysis; Clustering algorithms; Fabrics; Libraries; Network topology; Routing; Topology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    2167-4329
  • Print_ISBN
    978-1-4673-0805-2
  • Type
    conf
  • DOI
    10.1109/SC.2012.47
  • Filename
    6468515