Title :
P-3PC: a point-to-point communication model for automatic and optimal decomposition of regular domain problems
Author :
Seinstra, F.J. ; Koelma, D.
Author_Institution :
Intelligent Sensory Inf. Syst., Amsterdam Univ., Netherlands
Date :
7/1/2002
Abstract :
One of the most fundamental problems automatic parallelization tools are confronted with is to find an optimal domain decomposition for a given application. For regular domain problems (such as simple matrix manipulations), this task may seem trivial. However, communication costs in message-passing programs often depend significantly on the memory layout of data blocks to be transmitted. As a consequence, straightforward domain decompositions may be non-optimal. In this paper, we introduce a new point-to-point communication model, called P-3PC (Parameterized model based on the Three Paths of Communication), that is specifically designed to overcome this problem. In comparison with related models (e.g., LogGP), P-3PC is similar in complexity, but more accurate in many situations. Although the model is aimed at MPI's standard point-to-point operations, it is applicable to similar message-passing definitions as well. The effectiveness of the model is tested in a framework for automatic parallelization of low-level image processing applications. Experiments are performed on two Beowulf-type systems, each having a different interconnection network and a different MPI implementation. The results show that, where other models frequently fail, P-3PC correctly predicts the communication costs related to any type of domain decomposition.
Keywords :
application program interfaces; communication complexity; image processing; message passing; multiprocessor interconnection networks; problem solving; software performance evaluation; Beowulf-type systems; MPI; P-3PC; accuracy; automatic optimal domain decomposition; automatic parallelization; automatic parallelization tools; communication costs; communication paths; complexity; data block transmission; interconnection networks; low-level image processing applications; matrix manipulations; memory layout; message-passing programs; parameterized model; performance modeling; performance optimization; point-to-point communication model; regular domain problems; Automatic testing; Costs; Image processing; Matrix decomposition; Message passing; Multiprocessor interconnection networks; Optimization; Predictive models; Software design; Software performance;
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
DOI :
10.1109/TPDS.2002.1019863