• DocumentCode
    1920982
  • Title

    Parallelizing more Loops with Compiler Guided Refactoring

  • Author

    Larsen, Per ; Ladelsky, Razya ; Lidman, Jacob ; McKee, Sally A. ; Karlsson, Sven ; Zaks, Ayal

  • Author_Institution
    DTU Inf., Tech. Univ. of Denmark, Lyngby, Denmark
  • fYear
    2012
  • fDate
    10-13 Sept. 2012
  • Firstpage
    410
  • Lastpage
    419
  • Abstract
    The performance of many parallel applications relies not on instruction-level parallelism but on loop-level parallelism. Unfortunately, automatic parallelization of loops is a fragile process; many different obstacles affect or prevent it in practice. To address this predicament, we developed an interactive compilation feedback system that guides programmers in iteratively modifying their application source code. This helps leverage the compiler's ability to generate loop-parallel code. We employ our system to modify two sequential benchmarks dealing with image processing and edge detection, resulting in scalable parallelized code that runs up to 8.3 times faster on an eight-core Intel Xeon 5570 system and up to 12.5 times faster on a quad-core IBM POWER6 system. Benchmark performance varies significantly between the systems. This suggests that semi-automatic parallelization should be combined with target-specific optimizations. Furthermore, comparing the first benchmark to manually-parallelized, hand-optimized pthreads and OpenMP versions, we find that code generated using our approach typically outperforms the pthreads code (within 93-339%). It also performs competitively against the OpenMP code (within 75-111%). The second benchmark outperforms manually-parallelized and optimized OpenMP code (within 109-242%).
  • Keywords
    benchmark testing; edge detection; parallel programming; parallelising compilers; program control structures; software maintenance; software performance evaluation; application source code modification; automatic loop parallelization; benchmark performance; compiler guided refactoring; edge detection; eight-core Intel Xeon 5570 system; image processing; instruction-level parallelism; interactive compilation feedback system; loop-level parallelism; loop-parallel code generation; parallel application performance; quad-core IBM POWER6 system; scalable parallelized code; semiautomatic parallelization; sequential benchmarks; Arrays; Benchmark testing; Image edge detection; Kernel; Optimization; Production; Radiation detectors; Automatic Loop Parallelization; Compiler Feedback; Refactoring;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2012 41st International Conference on
  • Conference_Location
    Pittsburgh, PA
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4673-2508-0
  • Type

    conf

  • DOI
    10.1109/ICPP.2012.48
  • Filename
    6337602