• DocumentCode
    2193822
  • Title

    Cheetah: A Framework for Scalable Hierarchical Collective Operations

  • Author

    Graham, Richard ; Venkata, Manjunath Gorentla ; Ladd, Joshua ; Shamis, Pavel ; Rabinovitz, Ishai ; Filipov, Vasily ; Shainer, Gilad

  • Author_Institution
    Oak Ridge Nat. Lab., Oak Ridge, TN, USA
  • fYear
    2011
  • fDate
    23-26 May 2011
  • Firstpage
    73
  • Lastpage
    83
  • Abstract
    Collective communication operations, used by many scientific applications, tend to limit overall parallel application performance and scalability. Computer systems are becoming more heterogeneous with increasing node and core-per-node counts. Also, a growing number of data-access mechanisms, of varying characteristics, are supported within a single computer system. We describe a new hierarchical collective communication framework that takes advantage of hardware-specific data-access mechanisms. It is flexible, with run-time hierarchy specification and sharing of collective communication primitives between collective algorithms. Data buffers are shared between levels in the hierarchy, reducing collective communication management overhead. We have implemented several versions of the Message Passing Interface (MPI) collective operations MPI_Barrier() and MPI_Bcast(), and run experiments using up to 49,152 processes on a Cray XT5 and on a small InfiniBand-based cluster. At 49,152 processes our barrier implementation outperforms the optimized native implementation by 75%. 32-byte and one-megabyte broadcasts outperform it by 62% and 11%, respectively, with better scalability characteristics. Improvements relative to the default Open MPI implementation are much larger.
  • Keywords
    data analysis; information retrieval; message passing; optimisation; parallel processing; MPI Bcast; MPI barrier; barrier implementation; collective algorithm; computer system; core-per-node counts; data buffering; hardware-specific data access mechanism; hierarchy reducing collective communication management; message passing interface; open MPI implementation; optimized native implementation; parallel application performance; run time hierarchy specification; scalability characteristics; scalable hierarchical collective operation; scientific application; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Memory management; Network topology; Sockets; Topology; Collectives; Framework; Hierarchy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on
  • Conference_Location
    Newport Beach, CA
  • Print_ISBN
    978-1-4577-0129-0
  • Electronic_ISBN
    978-0-7695-4395-6
  • Type

    conf

  • DOI
    10.1109/CCGrid.2011.42
  • Filename
    5948598