• DocumentCode
    2852694
  • Title
    Replay-Based Synchronization of Timestamps in Event Traces of Massively Parallel Applications
  • Author
    Becker, Daniel ; Linford, John C. ; Rabenseifner, Rolf ; Wolf, Felix
  • Author_Institution
    Inst. for Adv. Simulation, Forschungszentrum Jülich, Jülich
  • fYear
    2008
  • fDate
    8-12 Sept. 2008
  • Firstpage
    212
  • Lastpage
    219
  • Abstract
    Event traces are helpful in understanding the performance behavior of message-passing applications since they allow in-depth analyses of communication and synchronization patterns. However, the absence of synchronized hardware clocks may render the analysis ineffective because inaccurate relative event timings can misrepresent the logical event order and lead to errors when quantifying the impact of certain behaviors. Although linear offset interpolation can restore consistency to some degree, inaccuracies and time-dependent drifts may still disarrange the original succession of events, especially during longer runs. In our earlier work, we presented an algorithm that removes the remaining violations of the logical event order postmortem and outlined the initial design of a parallel version. Here, we complete the parallel design and describe its implementation within the SCALASCA trace-analysis framework. We demonstrate its suitability for large-scale applications running on more than a thousand application processes and show how the correction can improve the trace analysis of a real-world application example.
  • Keywords
    message passing; parallel processing; program diagnostics; software performance evaluation; synchronisation; SCALASCA trace-analysis framework; communication patterns analysis; event tracing; linear offset interpolation; logical event order postmortem; massively parallel applications; message-passing applications; performance behavior; replay-based synchronization; synchronization patterns analysis; timestamps; Algorithm design and analysis; Application software; Clocks; Computer science; Interpolation; Large-scale systems; Performance analysis; Scalability; Synchronization; Timing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing - Workshops, 2008. ICPP-W '08. International Conference on
  • Conference_Location
    Portland, OR
  • ISSN
    1530-2016
  • Print_ISBN
    978-0-7695-3375-9
  • Electronic_ISBN
    1530-2016
  • Type
    conf
  • DOI
    10.1109/ICPP-W.2008.17
  • Filename
    4626803