DocumentCode :
1661542
Title :
Fast collective operations using shared and remote memory access protocols on clusters
Author :
Tipparaju, Vinod ; Nieplocha, Jarek ; Panda, Dhabaleswar
fYear :
2003
Abstract :
This paper describes a novel methodology for implementing a common set of collective communication operations on clusters based on symmetric multiprocessor (SMP) nodes. Called Shared-Remote-Memory collectives, or SRM, our approach replaces the point-to-point message passing, traditionally used in implementation of collective message-passing operations, with a combination of shared and remote memory access (RMA) protocols that are used to implement semantics of the collective operations directly. Appropriate embedding of the communication graphs in a cluster maximizes the use of shared memory and reduces network communication. Substantial performance improvements are achieved over the highly optimized commercial IBM implementation and the open-source MPICH implementation of MPI across a wide range of message sizes on the IBM SP. For example, depending on the message size and number of processors, SRM implementation of broadcast, reduce, and barrier outperforms IBM MPI_Bcast by 27-84%, MPI_Reduce by 24-79%, and MPI_Barrier by 73% on 256 processors, respectively.
Keywords :
interrupts; memory protocols; message passing; multiprocessing systems; workstation clusters; MPICH implementation; cluster computing; collective communication operations; communication graphs; point-to-point message passing; remote memory access protocols; semantics; shared memory access protocols; shared-remote-memory collectives; symmetric multiprocessor nodes; Access protocols; Broadcasting; Concurrent computing; Distributed computing; Hardware; Iterative algorithms; Message passing; Performance gain; Scalability; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2003. Proceedings. International
ISSN :
1530-2075
Print_ISBN :
0-7695-1926-1
Type :
conf
DOI :
10.1109/IPDPS.2003.1213188
Filename :
1213188