DocumentCode :
3041859
Title :
Host-assisted zero-copy remote memory access communication on InfiniBand
Author :
Tipparaju, V. ; Santhanaraman, G. ; Nieplocha, J. ; Panda, D.K.
Author_Institution :
Pacific Northwest Nat. Lab., Washington, DC, USA
fYear :
2004
fDate :
26-30 April 2004
Firstpage :
31
Abstract :
Summary form only given. Remote memory access (RMA) is an increasingly important communication model because of its excellent potential for overlapping communication with computation and for achieving high performance on modern networks with RDMA hardware such as InfiniBand. RMA plays a vital role in supporting the emerging global address space programming models. We describe how RMA can be implemented efficiently over InfiniBand. Capabilities not offered directly by the InfiniBand verb layer can be implemented efficiently using a novel host-assisted approach that achieves zero-copy communication and supports excellent overlap of computation with communication. For contiguous data, we achieve a small-message latency of 6 μs and a peak bandwidth of 830 MB/s for 'put', and a small-message latency of 12 μs and a peak bandwidth of 765 MB/s for 'get'. These numbers are almost as good as the performance of the native VAPI layer. For noncontiguous data, the host-assisted approach delivers bandwidth close to that for contiguous data. We also demonstrate that host-assisted data-transfer operations tolerate CPU-intensive tasks better than the traditional host-based approach, owing to the minimal host involvement in our approach. Our implementation also supports a very high degree of overlap of computation and communication: for large message sizes, we achieved 99% overlap for contiguous data and up to 95% for noncontiguous data. The NAS MG and matrix multiplication benchmarks were used to validate the effectiveness of our approach and demonstrated excellent overall performance.
Keywords :
bandwidth allocation; benchmark testing; electronic data interchange; fault tolerance; message passing; CPU intensive tasks; InfiniBand RDMA hardware; InfiniBand verb layer; NAS MG benchmarks; VAPI layer; contiguous data; global address space programming models; host-assisted data-transfer operations; host-assisted zero-copy remote memory access communication; matrix multiplication benchmarks; message latency; peak bandwidth; Access protocols; Bandwidth; Computer networks; Context; Data structures; Delay; Distributed processing; Laboratories; Libraries; Message passing
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Processing Symposium, 2004. Proceedings. 18th International
Print_ISBN :
0-7695-2132-0
Type :
conf
DOI :
10.1109/IPDPS.2004.1302943
Filename :
1302943