• DocumentCode
    3678437
  • Title
    Push Me Pull You: Integrating Opposing Data Transport Modes for Efficient HPC Application Monitoring
  • Author
    Omar Aaziz;Jonathan Cook;Hadi Sharifi
  • Author_Institution
    Comput. Sci. Dept., New Mexico State Univ., Las Cruces, NM, USA
  • fYear
    2015
  • Firstpage
    674
  • Lastpage
    681
  • Abstract
    While HPC system monitoring is a necessary and accepted practice, applications are still basically opaque in the production environment. For better HPC platform management and utilization, especially as platforms push towards exascale size, HPC applications need to be more transparent in their execution in the production environment. PROMON is a framework for application monitoring in the production environment, but its design concentrated on the front-end issues of offering easy-to-use application instrumentation. This paper presents the integration of PROMON with LDMS, a proven, efficient HPC system monitoring framework. PROMON and LDMS offer a case study in integrating two disparate instrumentation and monitoring models, and the lessons are applicable to other HPC monitoring issues.
  • Keywords
    "Monitoring","Instruments","Production","Data collection","Measurement","Data structures","Libraries"
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2015.118
  • Filename
    7307667