DocumentCode
1397802
Title
A unified framework for optimizing locality, parallelism, and communication in out-of-core computations
Author
Kandemir, Mahmut ; Choudhary, Alok ; Ramanujam, J. ; Kandaswamy, Meenakshi A.
Author_Institution
Dept. of Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
Volume
11
Issue
7
fYear
2000
fDate
7/1/2000 12:00:00 AM
Firstpage
648
Lastpage
668
Abstract
This paper presents a unified framework that optimizes out-of-core programs by exploiting locality and parallelism and by reducing communication overhead. For out-of-core problems, where the data set sizes far exceed the size of the available in-core memory, it is particularly important to exploit the memory hierarchy by optimizing the I/O accesses. We present algorithms that consider both iteration space (loop) and data space (file layout) transformations in a unified framework. We show that the performance of an out-of-core loop nest containing references to out-of-core arrays can be improved by using a suitable combination of file layout choices and loop restructuring transformations. Our approach considers array references one by one and attempts to optimize each reference for parallelism and locality. When there are references for which parallelism optimizations do not work, communication is vectorized so that data transfer can be performed before the innermost loop. Results from hand-compiled codes on IBM SP-2 and Intel Paragon distributed-memory message-passing architectures show that this approach reduces execution times and improves overall speedups. In addition, we extend the base algorithm to work with file layout constraints and show how it is useful for optimizing programs that consist of multiple loop nests.
Keywords
distributed memory systems; message passing; optimising compilers; data space; distributed-memory; file layout constraints; iteration space; locality; message-passing; out-of-core computations; parallelism; Algorithm design and analysis; Computer architecture; Computer science; Concurrent computing; Constraint optimization; Costs; Optimizing compilers; Parallel processing; Random access memory
fLanguage
English
Journal_Title
Parallel and Distributed Systems, IEEE Transactions on
Publisher
IEEE
ISSN
1045-9219
Type
jour
DOI
10.1109/71.877759
Filename
877759