• DocumentCode
    3678400
  • Title
    Distributed Modular Monitoring (DiMMon) Approach to Supercomputer Monitoring
  • Author
    Konstantin Stefanov;Vladimir Voevodin
  • Author_Institution
    Res. Comput. Center, Moscow State Univ., Moscow, Russia
  • fYear
    2015
  • Firstpage
    502
  • Lastpage
    503
  • Abstract
    In this work we propose a design for a new distributed modular monitoring framework that combines both monitoring tasks (supercomputer health monitoring and performance monitoring) in a single monitoring system. Our approach allows each part of the monitoring system to process only the data needed for the task assigned to that part. Another feature of our framework is the ability to calculate performance metrics on the fly, dynamically creating processing modules for every job or other object of interest.
  • Keywords
    "Monitoring","Supercomputers","Measurement","Conferences","Filtering","Data collection"
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2015 IEEE International Conference on
  • Type
    conf
  • DOI
    10.1109/CLUSTER.2015.83
  • Filename
    7307630