Title :
Cheetah: A Framework for Scalable Hierarchical Collective Operations
Author :
Graham, Richard ; Venkata, Manjunath Gorentla ; Ladd, Joshua ; Shamis, Pavel ; Rabinovitz, Ishai ; Filipov, Vasily ; Shainer, Gilad
Author_Institution :
Oak Ridge Nat. Lab., Oak Ridge, TN, USA
Abstract :
Collective communication operations, used by many scientific applications, tend to limit overall parallel application performance and scalability. Computer systems are becoming more heterogeneous, with increasing node and core-per-node counts. In addition, a growing number of data-access mechanisms with varying characteristics are supported within a single computer system. We describe a new hierarchical collective communication framework that takes advantage of hardware-specific data-access mechanisms. It is flexible, supporting run-time hierarchy specification and sharing of collective communication primitives between collective algorithms. Data buffers are shared between levels in the hierarchy, reducing collective communication management overhead. We have implemented several versions of the Message Passing Interface (MPI) collective operations MPI_Barrier() and MPI_Bcast(), and run experiments using up to 49,152 processes on a Cray XT5 and on a small InfiniBand-based cluster. At 49,152 processes, our barrier implementation outperforms the optimized native implementation by 75%. 32-byte and one-megabyte broadcasts outperform it by 62% and 11%, respectively, with better scalability characteristics. Improvements relative to the default Open MPI implementation are much larger.
Keywords :
data analysis; information retrieval; message passing; optimisation; parallel processing; MPI Bcast; MPI barrier; barrier implementation; collective algorithm; computer system; core-per-node counts; data buffering; hardware-specific data access mechanism; hierarchy reducing collective communication management; message passing interface; open MPI implementation; optimized native implementation; parallel application performance; run time hierarchy specification; scalability characteristics; scalable hierarchical collective operation; scientific application; Algorithm design and analysis; Classification algorithms; Clustering algorithms; Memory management; Network topology; Sockets; Topology; Collectives; Framework; Hierarchy;
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2011 11th IEEE/ACM International Symposium on
Conference_Location :
Newport Beach, CA
Print_ISBN :
978-1-4577-0129-0
Electronic_ISBN :
978-0-7695-4395-6
DOI :
10.1109/CCGrid.2011.42