• DocumentCode
    2748903
  • Title

    Cascaded execution: Speeding up unparallelized execution on shared-memory multiprocessors

  • Author

    Anderson, Ruth E. ; Nguyen, Thu D. ; Zahorjan, John

  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Washington, Seattle, WA, USA
  • fYear
    1999
  • fDate
    12-16 Apr 1999
  • Firstpage
    714
  • Lastpage
    719
  • Abstract
    Both inherently sequential code and limitations of analysis techniques prevent full parallelization of many applications by parallelizing compilers. Amdahl's Law tells us that as parallelization becomes increasingly effective, any unparallelized loop becomes an increasingly dominant performance bottleneck. We present a technique for speeding up the execution of unparallelized loops by cascading their sequential execution across multiple processors: only a single processor executes the loop body at any one time, and each processor executes only a portion of the loop body before passing control to another. Cascaded execution allows otherwise idle processors to optimize their memory state for the eventual execution of their next portion of the loop, resulting in significantly reduced overall loop body execution times. We evaluate cascaded execution using loop nests from wave5, a Spec95fp benchmark application, and a synthetic benchmark. Running on a PC with 4 Pentium Pro processors and an SGI Power Onyx with 8 R10000 processors, we observe an overall speedup of 1.35 and 1.7, respectively, for the wave5 loops we examined and speedups as high as 4.5 for individual loops. Our extrapolated results using the synthetic benchmark show a potential for speedups as large as 16 on future machines.
  • Keywords
    parallelising compilers; sequential codes; shared memory systems; analysis techniques; cascaded execution; inherently sequential code; sequential execution; shared-memory multiprocessors; synthetic benchmark; unparallelized execution; unparallelized loop; Application software; Computer science; Costs; Hardware; Law; Legal factors; Read only memory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 1999. 13th International and 10th Symposium on Parallel and Distributed Processing, 1999. 1999 IPPS/SPDP. Proceedings
  • Conference_Location
    San Juan
  • Print_ISBN
    0-7695-0143-5
  • Type

    conf

  • DOI
    10.1109/IPPS.1999.760554
  • Filename
    760554