• DocumentCode
    1689346
  • Title
    Accurately measuring collective operations at massive scale
  • Author
    Hoefler, Torsten ; Schneider, Timo ; Lumsdaine, Andrew
  • Author_Institution
    Open Syst. Lab., Indiana Univ., Bloomington, IN
  • fYear
    2008
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Accurate, reproducible and comparable measurement of collective operations is a complicated task. Although different measurement schemes are implemented in well-known benchmarks, many of these schemes introduce different systematic errors in their measurements. We characterize these errors and select a window-based approach as the most accurate method. However, this approach complicates measurements significantly and introduces clock synchronization as a new source of systematic errors. We analyze approaches to avoid or correct those errors and develop a scalable synchronization scheme to conduct benchmarks on massively parallel systems. Our results are compared to the window-based scheme implemented in the SKaMPI benchmarks and show a reduction of the synchronization overhead by a factor of 16 on 128 processes.
  • Keywords
    application program interfaces; message passing; parallel processing; synchronisation; SKaMPI benchmark; clock synchronization; massively parallel system measurement; performance analysis; window-based scheme; Clocks; Computer errors; Computer science; Concurrent computing; Delay; Error analysis; Laboratories; Open systems; Predictive models; Synchronization; MPI; benchmarking; collective operations; scalable synchronization; time synchronization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing, 2008. IPDPS 2008. IEEE International Symposium on
  • Conference_Location
    Miami, FL
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4244-1693-6
  • Electronic_ISBN
    1530-2075
  • Type
    conf
  • DOI
    10.1109/IPDPS.2008.4536494
  • Filename
    4536494