Algorithms for Mapping Parallel Processes onto Grid and Torus Architectures

Author

Glantz, Roland ; Meyerhenke, Henning ; Noe, Alexander

Author_Institution

Inst. of Theor. Inf., Karlsruhe Inst. of Technol. (KIT), Karlsruhe, Germany

fYear

2015

fDate

4-6 March 2015

Firstpage

236

Lastpage

243

Abstract

Static mapping is the assignment of parallel processes to the processing elements (PEs) of a parallel system, where the assignment does not change during the application's lifetime. In our scenario we model an application's computations and their dependencies by an application graph. This graph is first partitioned into (nearly) equally sized blocks. These blocks need to communicate at block boundaries. To assign the processes to PEs, our goal is to compute a communication-efficient bijective mapping between the blocks and the PEs. This approach of partitioning followed by bijective mapping has many degrees of freedom. Thus, users and developers of parallel applications need to know more about which choices work for which application graphs and which parallel architectures. To this end, we not only develop new mapping algorithms (derived from known greedy methods) but also perform extensive experiments involving different classes of application graphs (meshes and complex networks), architectures of parallel computers (grids and tori), as well as different partitioners and mapping algorithms. Surprisingly, the quality of the partitions, unless very poor, has little influence on the quality of the mapping. More importantly, one of our new mapping algorithms always yields the best results in terms of the quality measure maximum congestion when the application graphs are complex networks. In the case of meshes as application graphs, this mapping algorithm always leads in terms of both maximum congestion and maximum dilation, another common quality measure.
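
The two quality measures named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a grid without wraparound links, dimension-ordered routing, maximum dilation as the largest unweighted hop distance between communicating blocks, and maximum congestion as the largest total communication volume routed over a single link. The paper may define or weight these measures differently. All function names and the example data are hypothetical.

```python
from collections import defaultdict


def hop_distance(a, b, dims, torus=False):
    """Hop distance between PE coordinates a and b on a grid or torus."""
    total = 0
    for x, y, n in zip(a, b, dims):
        d = abs(x - y)
        # On a torus the shorter way around each dimension counts.
        total += min(d, n - d) if torus else d
    return total


def dimension_ordered_route(a, b):
    """Links used when correcting one coordinate at a time (grid only,
    no wraparound). This routing scheme is an assumption of the sketch."""
    links = []
    cur = list(a)
    for axis in range(len(a)):
        step = 1 if b[axis] > cur[axis] else -1
        while cur[axis] != b[axis]:
            nxt = cur.copy()
            nxt[axis] += step
            # A link is undirected, so store it as an unordered pair.
            links.append(frozenset((tuple(cur), tuple(nxt))))
            cur = nxt
    return links


def mapping_quality(comm, mapping, dims):
    """Return (max congestion, max dilation) of a block-to-PE mapping.

    comm:    {(block_u, block_v): communication volume}
    mapping: {block: PE coordinate tuple}, assumed to be bijective
    dims:    grid extent per dimension
    """
    load = defaultdict(int)
    max_dilation = 0
    for (u, v), volume in comm.items():
        a, b = mapping[u], mapping[v]
        max_dilation = max(max_dilation, hop_distance(a, b, dims))
        for link in dimension_ordered_route(a, b):
            load[link] += volume
    return max(load.values(), default=0), max_dilation


# Hypothetical example: four blocks communicating in a ring, on a 2x2 grid.
comm = {(0, 1): 3, (1, 2): 2, (2, 3): 3, (0, 3): 1}
ring_order = {0: (0, 0), 1: (0, 1), 2: (1, 1), 3: (1, 0)}
scrambled = {0: (0, 0), 1: (1, 1), 2: (0, 1), 3: (1, 0)}

print(mapping_quality(comm, ring_order, (2, 2)))  # (3, 1)
print(mapping_quality(comm, scrambled, (2, 2)))   # (6, 2)
```

In this toy case the mapping that places communicating blocks on adjacent PEs has both lower maximum congestion and lower maximum dilation than the scrambled one, which is the kind of difference a mapping algorithm tries to exploit.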

Keywords

graph theory; parallel algorithms; parallel architectures; application graphs; communication-efficient bijective mapping; greedy method; grid architecture; maximum congestion; maximum dilation; parallel process mapping; parallel system; static mapping; torus architecture; Complex networks; Computational modeling; Computer architecture; Computers; Informatics; Partitioning algorithms; Topology; Topology mapping; grids tori; new variant of greedy

fLanguage

English

Publisher

IEEE

Conference_Titel

Parallel, Distributed and Network-Based Processing (PDP), 2015 23rd Euromicro International Conference on

Conference_Location

Turku

ISSN

1066-6192

Type

conf

DOI

10.1109/PDP.2015.21

Filename

7092726