DocumentCode :
1519441
Title :
A generalized processor mapping technique for array redistribution
Author :
Hsu, Ching-Hsien ; Chung, Yeh-Ching ; Yang, Don-Lin ; Dow, Chyi-Ren
Author_Institution :
Dept. of Inf. Eng., Feng Chia Univ., Taichung, Taiwan
Volume :
12
Issue :
7
fYear :
2001
fDate :
7/1/2001
Firstpage :
743
Lastpage :
757
Abstract :
Many scientific applications that run as parallel programs on distributed memory multicomputers require array redistribution to enhance data locality and reduce remote memory access. Since the redistribution is performed at runtime, there is a performance trade-off between the efficiency of the new data decomposition for a subsequent phase of an algorithm and the cost of redistributing data among processors. In this paper, we present a generalized processor mapping technique that minimizes the amount of data exchanged in BLOCK-CYCLIC(kr) to BLOCK-CYCLIC(r) array redistribution and vice versa. The main idea is first to develop mapping functions that compute a new rank for each destination processor. From these mapping functions, a new logical sequence of destination processors is derived, and this sequence is then used to minimize the amount of data exchanged in a redistribution. The technique handles array redistribution with arbitrary source and destination processor sets and can be applied to multidimensional array redistribution. We present a theoretical model to analyze the performance improvement the technique provides. To evaluate it, we have implemented the technique on an IBM SP2 parallel machine. The experimental results show that it improves performance over a wide range of redistribution problems.
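The idea of renumbering destination processors can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not give the mapping functions, so the sketch substitutes a brute-force search over all rank permutations, which is only practical for a handful of processors. The names `owner`, `moved_elements`, and `best_rank_map` are hypothetical, and the sketch assumes a one-dimensional array with the same number of source and destination processors.

```python
from itertools import permutations

def owner(i, block, nprocs):
    """Rank that owns global index i under BLOCK-CYCLIC(block) on nprocs processors."""
    return (i // block) % nprocs

def moved_elements(n, k, r, nprocs, rank_map=None):
    """Count elements that change processor when an n-element array goes from
    BLOCK-CYCLIC(k*r) to BLOCK-CYCLIC(r).

    rank_map[d] is the physical processor that plays logical destination
    rank d; the default is the identity (no renumbering).
    """
    if rank_map is None:
        rank_map = range(nprocs)
    return sum(
        1
        for i in range(n)
        if owner(i, k * r, nprocs) != rank_map[owner(i, r, nprocs)]
    )

def best_rank_map(n, k, r, nprocs):
    """Exhaustive stand-in for the paper's mapping functions (an assumption):
    try every renumbering and keep one that moves the fewest elements."""
    return min(
        permutations(range(nprocs)),
        key=lambda m: moved_elements(n, k, r, nprocs, m),
    )

if __name__ == "__main__":
    n, k, r, nprocs = 12, 2, 1, 3
    print(moved_elements(n, k, r, nprocs))            # identity ranks: 8 elements move
    m = best_rank_map(n, k, r, nprocs)
    print(m, moved_elements(n, k, r, nprocs, m))      # renumbered ranks: 6 elements move
```

In this small case, renumbering the destination ranks reduces the exchanged elements from 8 to 6 out of 12, which is the kind of saving the logical processor sequence is meant to deliver without an exhaustive search.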
Keywords :
distributed memory systems; parallel programming; performance evaluation; IBM SP2 parallel machine; array redistribution; data decomposition; data exchange; data locality; distributed memory multicomputers; generalized processor mapping; generalized processor mapping technique; logical sequence; mapping functions; multidimensional array redistribution; parallel programs; performance improvement; performance trade-off; remote memory access; Application software; Computer Society; Costs; Helium; Multidimensional systems; Parallel machines; Parallel programming; Performance analysis; Phased arrays; Runtime;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/71.940748
Filename :
940748