Abstract:
In this paper, we present a unified, systematic framework for the complete and efficient parallelization of a practical DO loop model. Specifically, we discuss how, and in what order, transformations such as scalar expansion, array expansion, forward substitution, loop peeling, other cycle-removal transformations, and various reduction recognition techniques can be integrated into such a framework, and we develop efficient algorithms for the maximal application of loop distribution. Because it is based on the dependence concept, the framework also optimizes the loop itself by eliminating redundant loop nests and redundant code, which classical data-flow analysis cannot do, and thus improves the performance of the loop even on a scalar machine. The algorithms presented can also be used to detect induction, wraparound, flip-flop, periodic, and non-linear induction variables.
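
As a small illustration of why the order of transformations matters, the following sketch shows scalar expansion enabling loop distribution. It is not taken from the paper: the loop body, the function names `original` and `expanded_and_distributed`, and the use of Python in place of a Fortran DO loop are all assumptions made for the example. In the first version the scalar `t` is rewritten on every iteration, which creates dependences that tie the three statements together. Expanding `t` into an array removes those dependences, so the loop can be distributed into three loops whose iterations are independent of one another.

```python
def original(a, b):
    """Single loop; the scalar t is reused by every iteration."""
    n = len(a)
    c = [0] * n
    d = [0] * n
    for i in range(n):
        t = a[i] + b[i]   # S1: writes scalar t
        c[i] = 2 * t      # S2: reads t
        d[i] = t - c[i]   # S3: reads t and c[i]
    return c, d


def expanded_and_distributed(a, b):
    """Hypothetical result of scalar expansion followed by loop distribution."""
    n = len(a)
    # Scalar expansion: t becomes an array with one element per iteration.
    t = [0] * n
    c = [0] * n
    d = [0] * n
    # Loop distribution: one loop per statement. Within each loop the
    # iterations do not depend on each other, so each could run in parallel.
    for i in range(n):
        t[i] = a[i] + b[i]
    for i in range(n):
        c[i] = 2 * t[i]
    for i in range(n):
        d[i] = t[i] - c[i]
    return c, d
```

Both functions compute the same values of `c` and `d`. The distributed version trades extra storage for `t` against the removal of the dependences carried by the scalar.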