Regional Information Center for Science and Technology - High Performance MPI on IBM 12x InfiniBand Architecture

DocumentCode :

2789543

Title :

High Performance MPI on IBM 12x InfiniBand Architecture

Author :

Vishnu, Abhinav ; Benton, Brad ; Panda, Dhabaleswar K.

Author_Institution :

Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH

fYear :

2007

fDate :

26-30 March 2007

Firstpage :

Lastpage :

Abstract :

InfiniBand is becoming increasingly popular in the area of cluster computing due to its open standard and high performance. I/O interfaces like PCI-express and GX+ are being introduced as next-generation technologies to drive InfiniBand with very high throughput. HCAs with throughput of 8x on PCI-express have become available. Recently, support for HCAs with 12x throughput on GX+ has been announced. In this paper, we design a message passing interface (MPI) on IBM 12x dual-port HCAs, which consist of multiple send/recv engines per port. We propose and study the impact of various communication scheduling policies (binding, striping and round robin). Based on this study, we present a new policy, EPC (enhanced point-to-point and collective), which incorporates different kinds of communication patterns: point-to-point (blocking, non-blocking) and collective communication, for data transfer. We implement our design and evaluate it with micro-benchmarks, collective communication and NAS parallel benchmarks. Using EPC on a 12x InfiniBand cluster with one HCA and one port, we can improve the performance by 41% with the ping-pong latency test and 63-65% with the unidirectional and bi-directional bandwidth tests, compared with the default single-rail MPI implementation. Our evaluation on NAS parallel benchmarks shows an improvement of 7-13% in execution time for integer sort and Fourier transform.

Keywords :

Fourier transforms; application program interfaces; computer architecture; message passing; peripheral interfaces; Fourier transform; HCA; IBM 12x InfiniBand architecture; PCI-express; application program interface; cluster computing; communication scheduling policy; data transfer; high performance MPI; message passing; peripheral interface; Bandwidth; Benchmark testing; Bidirectional control; Computer architecture; Delay; Engines; High performance computing; Message passing; Round robin; Throughput;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International

Conference_Location :

Long Beach, CA

Print_ISBN :

1-4244-0910-1

Electronic_ISBN :

1-4244-0910-1

Type :

conf

DOI :

10.1109/IPDPS.2007.370407

Filename :

4228135

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2789543