DocumentCode :
3600526
Title :
Network Performance Aware MPI Collective Communication Operations in the Cloud
Author :
Yifan Gong ; Bingsheng He ; Jianlong Zhong
Author_Institution :
Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore, Singapore
Volume :
26
Issue :
11
fYear :
2015
Firstpage :
3079
Lastpage :
3089
Abstract :
This paper examines the performance of collective communication operations in the Message Passing Interface (MPI) in the cloud computing environment. Awareness of network topology has been a key factor in performance optimizations for existing MPI implementations. However, virtualization in the cloud not only hides network topology information from users, but also introduces traffic interference and dynamic variation in network performance. As a result, existing topology-aware optimizations are no longer feasible in the cloud environment. We therefore develop novel network performance aware algorithms for a series of collective communication operations, including broadcast, reduce, gather and scatter. We further implement two common applications, N-body and conjugate gradient (CG). We conducted our experiments using two complementary methods: runs on Amazon EC2 and simulations. Our experimental results show that network performance awareness yields performance improvements of 25.4 and 28.3 percent over MPICH2 on Amazon EC2 and in simulations, respectively. Evaluations on N-body and CG show application performance improvements of 41.6 and 14.3 percent, respectively.
Keywords :
application program interfaces; cloud computing; conjugate gradient methods; graph theory; message passing; network theory (graphs); optimisation; CG; MPI; N-body; cloud computing environment; collective communication operation; conjugate gradient; message passing interface; network performance aware algorithm; network topology awareness; performance optimization; Bandwidth; Cloud computing; Hardware; Network topology; Optimization; Topology; Virtual machining; Cloud computing; MPI; collective operations; network performance optimizations;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2013.96
Filename :
6490322