Author :
Wang, Haigeng ; Nicolau, Alexandru ; Siu, Kai-Yeng S.
Author_Institution :
Tibco Inc., Palo Alto, CA, USA
Abstract :
Prefix computation is a basic operation at the core of many important applications, e.g., some of the Grand Challenge problems, circuit design, digital signal processing, graph optimizations, and computational geometry. In this paper, we present new, strictly time-optimal parallel schedules for prefix computation with resource constraints under the concurrent-read-exclusive-write (CREW) parallel random access machine (PRAM) model. For prefix of N elements on p processors (p independent of N) when N>p(p+1)/2, we derive Harmonic Schedules that achieve the strictly optimal time (in steps), ⌈2(N-1)/(p+1)⌉. We also derive Pipelined Schedules that have better program-space efficiency than the Harmonic Schedules, yet require only a small constant number of steps more than the optimal time achieved by the Harmonic Schedules. Both the Harmonic Schedules and the Pipelined Schedules are simple and easy to implement. For prefix of N elements on p processors (p independent of N) where N⩽p(p+1)/2, the Harmonic Schedules are not time-optimal. For these cases, we establish an optimization method for determining key parameters of time-optimal schedules, based on connections between the structure of parallel prefix and Pascal's triangle. Using the derived parameters, we devise an algorithm to construct such schedules. For a restricted class of values of N and p, we prove that the constructed schedules are strictly time-optimal. We also give strong empirical evidence that our algorithm constructs strictly time-optimal schedules for all cases where N⩽p(p+1)/2.
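As a small illustration of the quantities named in the abstract, the following Python sketch defines a sequential prefix (scan) and evaluates the stated step bound, taking it as the ceiling of 2(N-1)/(p+1) for N>p(p+1)/2. It does not construct a Harmonic or Pipelined Schedule; the function names `prefix`, `in_harmonic_regime`, and `optimal_steps` are hypothetical and are not taken from the paper.

```python
from itertools import accumulate
import operator


def prefix(xs, op=operator.add):
    """Sequential prefix (scan) of xs under an associative operator.

    Returns [x0, x0 op x1, x0 op x1 op x2, ...]; for N elements this
    applies op N-1 times, one after another.
    """
    return list(accumulate(xs, op))


def in_harmonic_regime(n, p):
    """True when N > p(p+1)/2, the case the abstract treats with Harmonic Schedules."""
    return n > p * (p + 1) // 2


def optimal_steps(n, p):
    """Step bound ceil(2(N-1)/(p+1)) quoted in the abstract for N > p(p+1)/2.

    Assumption: the bound is read as a ceiling. Outside the stated regime
    the abstract gives no closed form, so this sketch refuses to answer.
    """
    if not in_harmonic_regime(n, p):
        raise ValueError("bound is only stated for N > p(p+1)/2")
    # Integer ceiling of 2(n-1)/(p+1) without floating point.
    return (2 * (n - 1) + p) // (p + 1)
```

With p = 1 the bound reduces to N-1, the number of operator applications in the sequential scan above, which gives a quick sanity check on the formula.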
Keywords :
combinatorial mathematics; optimisation; parallelising compilers; Grand Challenge problems; Pascal's triangle; associative operations; circuit design; combinatorial optimization; computational geometry; concurrent-read-exclusive-write parallel random access machine; digital signal processing; graph optimizations; harmonic schedules; loop parallelization; loop-carried dependences; optimal schedules; parallel prefix; pipelined schedules; resource constraints; scan operator; resource-constrained parallel algorithms; strict time lower bound; time-optimal parallel schedules; time-optimal schedules; tree-height reduction; Circuit synthesis; Computational geometry; Concurrent computing; Design optimization; Digital signal processing; Optimal scheduling; Phase change random access memory; Processor scheduling; Scheduling algorithm; Signal processing algorithms