• DocumentCode
    2845923
  • Title
    Performance of HPC Middleware over InfiniBand WAN
  • Author
    Narravula, S. ; Subramoni, H. ; Lai, P. ; Noronha, R. ; Panda, D.K.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
  • fYear
    2008
  • fDate
    9-12 Sept. 2008
  • Firstpage
    304
  • Lastpage
    311
  • Abstract
    High performance interconnects such as InfiniBand (IB) have enabled large scale deployments of High Performance Computing (HPC) systems. High performance communication and IO middleware such as MPI and NFS over RDMA have also been redesigned to leverage the performance of these modern interconnects. With the advent of long haul InfiniBand (IB WAN), IB applications now have inter-cluster reaches. While this technology is intended to enable high performance network connectivity across WAN links, it is important to study and characterize the actual performance that the existing IB middleware achieves in these emerging IB WAN scenarios. In this paper, we study and analyze the performance characteristics of the following three HPC middleware: (i) IPoIB (IP traffic over IB), (ii) MPI and (iii) NFS over RDMA. We utilize the Obsidian IB WAN routers for inter-cluster connectivity. Our results show that many of the applications absorb smaller network delays fairly well. However, most approaches are severely impacted in high delay scenarios. Further, communication protocols need to be optimized in higher delay scenarios to improve performance. In this paper, we propose several such optimizations to improve communication performance. Our experimental results show that techniques such as WAN-aware protocols, transferring data using large messages (message coalescing) and using parallel data streams can improve communication performance (by up to 50%) in high delay scenarios. Overall, these results demonstrate that IB WAN technologies can enable cluster-of-clusters architecture as a feasible platform for HPC systems.
  • Keywords
    middleware; protocols; wide area networks; workstation clusters; HPC middleware; InfiniBand WAN; WAN-aware protocols; cluster-of-clusters architecture; communication protocols; high performance computing systems; high performance network connectivity; intercluster connectivity; network delays; parallel data streams; Costs; Delay; Fabrics; High performance computing; Libraries; Middleware; Parallel processing; Performance analysis; Protocols; Wide area networks; Cluster-of-Clusters; IPoIB; InfiniBand; InfiniBand WAN; MPI; MVAPICH2; NFS; Obsidian Longbow XR;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2008. ICPP '08. 37th International Conference on
  • Conference_Location
    Portland, OR
  • ISSN
    0190-3918
  • Print_ISBN
    978-0-7695-3374-2
  • Electronic_ISBN
    0190-3918
  • Type
    conf
  • DOI
    10.1109/ICPP.2008.75
  • Filename
    4625863