DocumentCode
977520
Title
Automatic partitioning of parallel loops and data arrays for distributed shared-memory multiprocessors
Author
Agarwal, Anant ; Kranz, David A. ; Natarajan, Venkat
Author_Institution
Lab. for Comput. Sci., MIT, Cambridge, MA, USA
Volume
6
Issue
9
fYear
1995
fDate
9/1/1995 12:00:00 AM
Firstpage
943
Lastpage
962
Abstract
This paper presents a theoretical framework for automatically partitioning parallel loops to minimize cache coherency traffic on shared-memory multiprocessors. Several previous papers have examined hyperplane partitioning of iteration spaces to reduce communication traffic, but the problem of deriving optimal tiling parameters for minimal communication in loops with general affine index expressions has remained open. We solve this problem with a method for deriving an optimal hyperparallelepiped tiling of iteration spaces for minimal communication in multiprocessors with caches. We show that the same framework can also determine optimal tiling parameters for both data and loop partitioning in distributed-memory multicomputers. The framework uses matrices to represent iteration and data space mappings and the notion of uniformly intersecting references to capture temporal locality in array references. We introduce the notion of data footprints to estimate the communication traffic between processors, and we use linear algebraic methods and lattice theory to compute the size of data footprints precisely. We have implemented this framework in a compiler for Alewife, a distributed shared-memory multiprocessor.
Keywords
arrays; cache storage; distributed memory systems; iterative methods; lattice theory; matrix algebra; parallel programming; parallelising compilers; program control structures; shared memory systems; telecommunication traffic; Alewife compiler; affine index expressions; array references; automatic partitioning; cache coherency traffic minimization; data arrays; data footprints; data partitioning; data space mappings; distributed shared-memory multiprocessors; inter-processor communication traffic estimation; iteration spaces; linear algebra; loop partitioning; matrices; minimal communication; optimal hyperparallelepiped tiling; optimal tiling parameters; parallel loops; temporal locality; uniformly intersecting references; Context; Distributed computing; Laboratories; Lattices; Partitioning algorithms; Program processors; Programming profession; Shape; Tiles
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/71.466632
Filename
466632