• DocumentCode
    3501416
  • Title
    Designing High Performance and Scalable MPI Intra-node Communication Support for Clusters
  • Author
    Chai, Lei ; Hartono, Albert ; Panda, Dhabaleswar K.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH
  • fYear
    2006
  • fDate
    25-28 Sept. 2006
  • Firstpage
    1
  • Lastpage
    10
  • Abstract
    As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performance computing. This paper presents a new design for MPI intra-node communication that aims to achieve both high performance and good scalability in a cluster environment. The design distinguishes small and large messages and handles them differently to minimize the data transfer overhead for small messages and the memory space consumed by large messages. Moreover, the design utilizes the cache efficiently and requires no locking mechanisms to achieve optimal performance even with large system sizes. This paper also explores various optimization strategies to reduce polling overhead and maintain data locality. We have evaluated our design on NUMA (non-uniform memory access) and dual core NUMA systems. The experimental results on the NUMA system show that the new design can improve MPI intra-node latency by up to 35% and bandwidth by up to 50% compared to MVAPICH. While running the bandwidth benchmark, the measured L2 cache miss rate is reduced by half. The new design also improves the performance of MPI collective calls by up to 25%. The results on the dual core NUMA system show that the new design can achieve 0.48 μs in CMP latency
  • Keywords
    message passing; shared memory systems; workstation clusters; L2 cache miss rate; bandwidth benchmark; cluster computing; dual core NUMA system; high performance MPI intra-node communication support; larger SMP systems; memory architectures; multicore processor; nonuniform memory access systems; scalable MPI intra-node communication support; Bandwidth; Computer architecture; Computer science; Delay; Design engineering; High performance computing; Memory architecture; Multicore processing; Scalability; Sun; Cluster Computing; Intra-node Communication; MPI; Multi-core Processor; Non-Uniform Memory Access (NUMA)
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing, 2006 IEEE International Conference on
  • Conference_Location
    Barcelona
  • ISSN
    1552-5244
  • Print_ISBN
    1-4244-0327-8
  • Electronic_ISBN
    1552-5244
  • Type
    conf
  • DOI
    10.1109/CLUSTR.2006.311850
  • Filename
    4100356