DocumentCode :
3452783
Title :
Application-bypass broadcast in MPICH over GM
Author :
Buntinas, Darius ; Panda, Dhabaleswar K. ; Brightwell, Ron
Author_Institution :
Dept. of Comput. & Inf. Sci., Ohio State Univ., Columbus, OH, USA
fYear :
2003
fDate :
12-15 May 2003
Firstpage :
2
Lastpage :
9
Abstract :
Processes of a parallel program can become unsynchronized, or skewed, during the course of running an application. Processes can become skewed as a result of unbalanced or asymmetric code, or through the use of heterogeneous systems, where nodes in the system have different performance characteristics, as well as random, unpredictable effects such as the processes not being started at exactly the same time, or processors receiving interrupts during computation. Geographically distributed systems may have more severe skew because of variable communication times. Such skew can have a significant impact on the performance of collective communication operations, which impose an implicit synchronization. The broadcast operation in MPICH is one such operation. An application-bypass broadcast operation is one which does not depend on the application running at a process to make progress. Such an operation would not be as sensitive to process skew. This paper describes the design and implementation of an application-bypass broadcast operation. We evaluate the implementation and find a factor of improvement of up to 16 for application-bypass broadcast compared to non-application-bypass broadcast when processes are skewed. Furthermore, we see that as the system size increases, the effects of skew on non-application-bypass broadcast also increase. The application-bypass broadcast is much less sensitive to process skew, which makes it more scalable than the non-application-bypass broadcast operation.
Keywords :
distributed processing; message passing; synchronisation; MPICH application-bypass broadcast; geographically distributed system; heterogeneous system; parallel program; synchronization; Broadcasting; Computer networks; Delay; Distributed computing; Grid computing; Information science; Intelligent networks; Laboratories; System performance; US Department of Energy;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1919-9
Type :
conf
DOI :
10.1109/CCGRID.2003.1199346
Filename :
1199346