DocumentCode
3678437
Title
Push Me Pull You: Integrating Opposing Data Transport Modes for Efficient HPC Application Monitoring
Author
Omar Aaziz;Jonathan Cook;Hadi Sharifi
Author_Institution
Comput. Sci. Dept., New Mexico State Univ., Las Cruces, NM, USA
fYear
2015
Firstpage
674
Lastpage
681
Abstract
While HPC system monitoring is a necessary and accepted practice, applications are still basically opaque in the production environment. For better HPC platform management and utilization, especially as platforms push towards exascale size, HPC applications need to be more transparent in their execution in the production environment. PROMON is a framework for application monitoring in the production environment, but its design concentrated on the front-end issues of offering easy-to-use application instrumentation. This paper presents the integration of PROMON with LDMS, a proven efficient HPC system monitoring framework. PROMON and LDMS offer a case study in integrating two disparate instrumentation and monitoring models, and the lessons are applicable to other HPC monitoring issues.
Keywords
"Monitoring","Instruments","Production","Data collection","Measurement","Data structures","Libraries"
Publisher
IEEE
Conference_Titel
Cluster Computing (CLUSTER), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/CLUSTER.2015.118
Filename
7307667