Fast barrier synchronization for InfiniBand™

Author

Hoefler, Torsten ; Mehlan, Torsten ; Mietke, Frank ; Rehm, Wolfgang

Author_Institution

Dept. of Comput. Sci., Chemnitz Univ. of Technol.

fYear

2006

fDate

25-29 April 2006

Abstract

The MPI_Barrier() call can be crucial for several applications and has been the target of various optimizations for several decades. The best solution to the barrier problem scales with O(log₂ N) and uses the dissemination principle. This paper demonstrates a new method that uses an enhanced dissemination principle and inherent network parallelism. The new approach sped up barrier performance by 40% relative to the best published algorithm. It is shown that the inherent hardware parallelism inside the InfiniBand™ network can be leveraged to lower the latency of the MPI_Barrier() operation without additional cost. The principle of sending multiple messages in (pseudo-)parallel can be built into a well-known algorithm to decrease the number of rounds and speed up the overall operation.
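The round-count argument in the abstract can be illustrated with a small simulation. This is a minimal sketch and not the authors' implementation: it assumes that the enhanced scheme lets each rank send `fanout` messages per round with a stride of `(fanout + 1) ** round`, so that `fanout = 1` reduces to the classic dissemination barrier. The function names and the `fanout` parameter are hypothetical, and no real MPI or InfiniBand calls are used.

```python
def dissemination_rounds(num_procs, fanout=1):
    """Smallest number of rounds r with (fanout + 1) ** r >= num_procs."""
    rounds, reach = 0, 1
    while reach < num_procs:
        reach *= fanout + 1
        rounds += 1
    return rounds


def send_targets(rank, num_procs, round_idx, fanout=1):
    """Ranks that `rank` notifies in round `round_idx` (0-based).

    Assumption: the stride grows as (fanout + 1) ** round_idx.
    """
    stride = (fanout + 1) ** round_idx
    return [(rank + j * stride) % num_procs for j in range(1, fanout + 1)]


def simulate_barrier(num_procs, fanout=1):
    """Return True if every rank has heard, directly or transitively,
    from every other rank once all rounds have completed."""
    known = [{r} for r in range(num_procs)]
    for k in range(dissemination_rounds(num_procs, fanout)):
        # Messages in a round carry the knowledge held at the round's start.
        snapshot = [set(s) for s in known]
        for r in range(num_procs):
            for t in send_targets(r, num_procs, k, fanout):
                known[t] |= snapshot[r]
    return all(len(s) == num_procs for s in known)


if __name__ == "__main__":
    for fanout in (1, 2, 3):
        print(fanout, dissemination_rounds(64, fanout), simulate_barrier(64, fanout))
```

Under these assumptions, 64 ranks need 6 rounds with one message per round and 3 rounds with three messages per round. Whether fewer rounds translate into lower latency depends on how well the network overlaps the messages sent within a round, which is the hardware parallelism the abstract refers to.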

Keywords

computational complexity; message passing; parallel processing; synchronisation; InfiniBand; MPI_Barrier() call; dissemination principle; fast barrier synchronization; inherent network parallelism; Application software; Bandwidth; Chemical technology; Clustering algorithms; Computer science; Costs; Counting circuits; Delay; Hardware; Parallel processing

fLanguage

English

Publisher

IEEE

Conference_Titel

Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International

Conference_Location

Rhodes Island

Print_ISBN

1-4244-0054-6

Type

conf

DOI

10.1109/IPDPS.2006.1639561

Filename

1639561