Title :
Adaptive connection management for scalable MPI over InfiniBand
Author :
Yu, Weikuan ; Gao, Qi ; Panda, Dhabaleswar K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Ohio State Univ., USA
Abstract :
Supporting scalable and efficient parallel programs is a major challenge in parallel computing with the widespread adoption of large-scale computer clusters and supercomputers. One of the pronounced scalability challenges is the management of connections between parallel processes, especially over connection-oriented interconnects such as VIA and InfiniBand. In this paper, we take on the challenge of designing efficient connection management for parallel programs over InfiniBand clusters. We propose adaptive connection management (ACM) to dynamically control the establishment of InfiniBand reliable connections (RC) based on the communication frequency between MPI processes. We have investigated two different ACM algorithms: an on-demand algorithm that starts with no InfiniBand RC connections, and a partial static algorithm that initially sets up only 2 * logN InfiniBand RC connections. We have designed and implemented both ACM algorithms in MVAPICH to study their benefits. Two mechanisms have been exploited for the establishment of new RC connections: one using InfiniBand unreliable datagram and the other using InfiniBand connection management. For both mechanisms, MPI communication issues, such as progress rules, reliability, and race conditions, are handled to ensure efficient and lightweight connection management. Our experimental results indicate that ACM algorithms can benefit parallel programs in terms of the process initiation time, the number of active connections, and the resource usage. For parallel programs on a 16-node cluster, they can reduce the process initiation time by 15% and the initial memory usage by 18%.
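The two ACM starting points described in the abstract can be illustrated with a small bookkeeping model. This is a minimal sketch, not the MVAPICH implementation: it makes no InfiniBand calls, and the class name `AdaptiveConnectionManager`, the `threshold` parameter, and the choice of initial peers (ranks at power-of-two distances, which gives at most 2 * logN peers) are all assumptions made for illustration. The abstract does not say which peers the partial static algorithm connects to, nor how communication frequency is measured.

```python
import math


class AdaptiveConnectionManager:
    """Toy per-process connection bookkeeping (hypothetical, no real InfiniBand calls)."""

    def __init__(self, rank, nprocs, mode="on_demand", threshold=1):
        if mode not in ("on_demand", "partial_static"):
            raise ValueError(f"unknown mode: {mode}")
        self.rank = rank
        self.nprocs = nprocs
        self.threshold = threshold  # assumption: messages to a peer before connecting
        self.msg_count = {}         # peer rank -> messages counted so far
        self.connected = set()      # peers with an established connection
        if mode == "partial_static":
            self.connected = self._initial_peers()

    def _initial_peers(self):
        # Assumption: connect to peers at power-of-two rank distances in both
        # directions. This yields at most 2 * log2(nprocs) distinct peers.
        steps = math.ceil(math.log2(self.nprocs)) if self.nprocs > 1 else 0
        peers = set()
        for k in range(steps):
            for sign in (1, -1):
                peer = (self.rank + sign * (1 << k)) % self.nprocs
                if peer != self.rank:
                    peers.add(peer)
        return peers

    def record_message(self, peer):
        """Count one message to `peer`; return True if it triggers a new connection."""
        if peer in self.connected:
            return False
        self.msg_count[peer] = self.msg_count.get(peer, 0) + 1
        if self.msg_count[peer] >= self.threshold:
            self.connected.add(peer)
            return True
        return False


if __name__ == "__main__":
    on_demand = AdaptiveConnectionManager(rank=0, nprocs=16)
    print(len(on_demand.connected))
    partial = AdaptiveConnectionManager(rank=0, nprocs=16, mode="partial_static")
    print(sorted(partial.connected))
```

In this model the on-demand variant begins with an empty connection set and adds a peer once traffic to it reaches the threshold, while the partial static variant begins with a small preselected set and grows in the same way afterwards.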
Keywords :
message passing; parallel programming; workstation clusters; InfiniBand cluster; InfiniBand connection management; InfiniBand reliable connection; InfiniBand unreliable datagram; active connection; adaptive connection management; communication frequency; connection-oriented interconnects; large-scale computer cluster; on-demand algorithm; parallel computing; parallel program; partial static algorithm; process initiation time; scalable MPI; supercomputer; Adaptive control; Clustering algorithms; Communication system control; Concurrent computing; Large-scale systems; Parallel processing; Programmable control; Radio control; Scalability; Supercomputers
Conference_Title :
Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International
Print_ISBN :
1-4244-0054-6
DOI :
10.1109/IPDPS.2006.1639338