DocumentCode
3429339
Title
Design of a scalable InfiniBand topology service to enable network-topology-aware placement of processes
Author
Subramoni, Hari ; Potluri, Sreeram ; Kandalla, Krishna ; Barth, B. ; Vienne, Jerome ; Keasler, Jeff ; Tomko, Karen ; Schulz, K. ; Moody, Adam ; Panda, Dhabaleswar K.
Author_Institution
Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA
fYear
2012
fDate
10-16 Nov. 2012
Firstpage
1
Lastpage
12
Abstract
Over the last decade, InfiniBand has become an increasingly popular interconnect for deploying modern supercomputing systems. However, no detection service exists that can discover the underlying network topology in a scalable manner and expose this information conveniently to runtime libraries and users of high performance computing systems. In this paper, we design a novel and scalable method to detect the InfiniBand network topology using Neighbor-Joining (NJ) techniques. To the best of our knowledge, this is the first time the neighbor-joining algorithm has been applied to the problem of detecting InfiniBand network topology. We also design a network-topology-aware MPI library that takes advantage of the network topology service. The library places the processes of an MPI job in a network-topology-aware manner, with the dual aim of increasing intra-node communication and reducing long-distance inter-node communication across the InfiniBand fabric.
Keywords
application program interfaces; message passing; network topology; parallel processing; InfiniBand fabric; InfiniBand network topology; InfiniBand topology service; NJ; high performance computing systems; internode communication; intra-node communication; neighbor-joining techniques; network-topology-aware MPI library; process network-topology-aware placement; runtime libraries; supercomputing systems; Algorithm design and analysis; Clustering algorithms; Fabrics; Libraries; Network topology; Routing; Topology;
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
Conference_Location
Salt Lake City, UT
ISSN
2167-4329
Print_ISBN
978-1-4673-0805-2
Type
conf
DOI
10.1109/SC.2012.47
Filename
6468515