Title :
High performance implementation of MPI derived datatype communication over InfiniBand
Author :
Wu, Jiesheng ; Wyckoff, Pete ; Panda, Dhabaleswar
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
Abstract :
Summary form only given. In this paper, a systematic study of two main types of approaches for MPI datatype communication (pack/unpack-based approaches and copy-reduced approaches) is carried out on the InfiniBand network. We focus on overlapping packing, network communication, and unpacking in the pack/unpack-based approaches. We use RDMA operations to avoid packing and/or unpacking in the copy-reduced approaches. Four schemes (buffer-centric segment pack/unpack, RDMA write gather with unpack, pack with RDMA read scatter, and multiple RDMA writes) have been proposed. Three of them have been implemented and evaluated based on one MPI implementation over InfiniBand. Performance results of a vector microbenchmark demonstrate that latency is improved by a factor of up to 3.4 and bandwidth by a factor of up to 3.6 compared to the current datatype communication implementation. Collective operations such as MPI_Alltoall also benefit: our measurements on an 8-node system show improvements of up to a factor of 2.0 for those operations.
Keywords :
data communication; message passing; performance evaluation; 8-node system; InfiniBand network; MPI derived datatype communication; copy-reduced approach; microbenchmark; network communication; overlapping packing; Bandwidth; Computer networks; Costs; Data communication; Delay; High performance computing; Information science; Message passing; Scattering; Supercomputers
Conference_Title :
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International
Print_ISBN :
0-7695-2132-0
DOI :
10.1109/IPDPS.2004.1302917