• DocumentCode
    587606
  • Title
    Deconstructing the overhead in parallel applications
  • Author
    Roth, Michael ; Best, M.J. ; Mustard, C. ; Fedorova, Alexandra
  • Author_Institution
    Simon Fraser Univ., Burnaby, BC, Canada
  • fYear
    2012
  • fDate
    4-6 Nov. 2012
  • Firstpage
    59
  • Lastpage
    68
  • Abstract
    Performance problems in parallel programs manifest as a lack of scalability. These scalability issues are often very difficult to debug. They can stem from synchronization overhead, poor thread scheduling decisions, or contention for hardware resources, such as shared caches. Traditional profiling tools attribute program cycles to different functions, but do not give immediate insight into the issues limiting scalability. Profiling information is very program-specific and is usually processed manually by a human expert in a time-consuming and cumbersome process. Our experience in tuning the performance of parallel applications led us to discover that performance tuning can be considerably simplified, and even to some degree automated, if profiling measurements are organized according to several intuitive performance factors common to most parallel programs. In this work we present these factors and propose a hierarchical framework composing them. We present three case studies where analyzing profiling data according to the proposed principle led us to improve the performance of three parallel programs by a factor of 6-20×. Our work lays the foundation for new ways of organizing and visualizing profiling data in performance tuning tools.
  • Keywords
    data visualisation; parallel programming; processor scheduling; program debugging; software performance evaluation; synchronisation; debug; hardware resource contention; overhead deconstruction; parallel application; parallel programs; performance factors; performance tuning tools; profiling data organization; profiling data visualization; profiling measurements; profiling tools; program-specific information; scalability issues; shared caches; synchronization overhead; thread scheduling decisions; Delay; Hardware; Program processors; Runtime; Scalability; Synchronization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Workload Characterization (IISWC), 2012 IEEE International Symposium on
  • Conference_Location
    La Jolla, CA
  • Print_ISBN
    978-1-4673-4531-6
  • Type
    conf
  • DOI
    10.1109/IISWC.2012.6402901
  • Filename
    6402901