DocumentCode :
1925623
Title :
Optimizing Process-to-Core Mappings for Application Level Multi-dimensional MPI Communications
Author :
Karlsson, Christer ; Davies, Teresa ; Chen, Zizhong
Author_Institution :
Colorado Sch. of Mines, Golden, CO, USA
fYear :
2012
fDate :
24-28 Sept. 2012
Firstpage :
486
Lastpage :
494
Abstract :
Multi-dimensional MPI communications, in which communications must be performed along each dimension of a Cartesian communicator, are frequently used in today's high-performance computing applications. While individual MPI collective communications on regular communicators with a one-dimensional linear ranking of processes have been extensively studied and optimized, little optimization has been done for multi-dimensional MPI collective communications on multi-dimensional Cartesian topologies. In this paper, we optimize multi-dimensional MPI collective communications for SMP and multi-core systems at the application level. We show that the default Cartesian topology built by state-of-the-art MPI implementations produces sub-optimal performance for multi-dimensional MPI collective communications. We design optimal process-to-core mapping schemes for Cartesian communicators that minimize the total inter-node communication. The proposed technique improves performance by up to 80% over the default Cartesian topology built by Cray's MPI implementation, MPT 3.1.02, on Jaguar at Oak Ridge National Laboratory, the world's current second-fastest supercomputer.
Keywords :
application program interfaces; message passing; multiprocessing systems; Cartesian communicator; Jaguar supercomputer; MPT 3.1.02; SMP; application level multidimensional MPI communication; default Cartesian topology; high performance computing application; multicore system; multidimensional Cartesian topology; multidimensional MPI collective communication; one-dimensional linear-ranking; optimal process-to-core mapping scheme; process-to-core mapping optimisation; total inter-node communication minimisation; Arrays; Multicore processing; Pipelines; Program processors; Standards; Tiles; Topology; Cartesian Topology; Collective Communication; Message Passing Interface (MPI); Multicore; Process-to-Core Mapping
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cluster Computing (CLUSTER), 2012 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2422-9
Type :
conf
DOI :
10.1109/CLUSTER.2012.47
Filename :
6337812