• DocumentCode
    1926420
  • Title

    Evaluating Power-Monitoring Capabilities on IBM Blue Gene/P and Blue Gene/Q

  • Author

    Yoshii, Kazutomo ; Iskra, Kamil ; Gupta, Rinku ; Beckman, Pete ; Vishwanath, Venkatram ; Yu, Chenjie ; Coghlan, Susan

  • Author_Institution
    Math. & Comput. Sci. Div., Argonne Nat. Lab., Argonne, IL, USA
  • fYear
    2012
  • fDate
    24-28 Sept. 2012
  • Firstpage
    36
  • Lastpage
    44
  • Abstract
    Power consumption is becoming a critical factor as we continue our quest toward exascale computing. Yet the actual power utilization of a complete system is an insufficiently studied research area. Estimating the power consumption of a large-scale system is a nontrivial task because a large number of components are involved and because power requirements are affected by the (unpredictable) workloads. Clearly needed is a power-monitoring infrastructure that can provide timely and accurate feedback to system developers and application writers so that they can optimize the use of this precious resource. Many existing large-scale installations do feature power-monitoring sensors; however, those are part of environmental- and health-monitoring subsystems and were not designed with application-level power consumption measurements in mind. In this paper, we evaluate the existing power monitoring of IBM Blue Gene systems, with the goal of understanding what capabilities are available and how they fare with respect to spatial and temporal resolution, accuracy, latency, and other characteristics. We find that with a careful choice of dedicated microbenchmarks, we can obtain meaningful power consumption data even on Blue Gene/P, where the interval between available data points is measured in minutes. We next evaluate the monitoring subsystem on Blue Gene/Q and are able to study the power characteristics of its FPU and memory subsystems. We find the monitoring subsystem capable of providing second-scale resolution of power data, conveniently separated between node components, with a latency of seven seconds. This represents a significant improvement in power-monitoring infrastructure, and we hope future systems will enable real-time power measurement in order to better understand application behavior at a finer granularity.
  • Keywords
    computer power supplies; parallel machines; power consumption; system monitoring; FPU; IBM Blue Gene systems; IBM Blue Gene/P; IBM Blue Gene/Q; accurate feedback; application behavior; application level power consumption measurements; application writers; environmental monitoring subsystems; exascale computing; feature power-monitoring sensors; health monitoring subsystems; hope future systems; large scale system; large-scale installations; memory subsystems; power characteristics; power consumption data; power monitoring infrastructure; power requirements; power utilization; power-monitoring capabilities; power-monitoring infrastructure; real-time power measurement; second-scale resolution; spatial resolution; temporal resolution; Instruction sets; Memory management; Monitoring; Power demand; Power measurement; Stress; Blue Gene/P; Blue Gene/Q; HPC; Microbenchmarks; Power monitoring and profiling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Cluster Computing (CLUSTER), 2012 IEEE International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4673-2422-9
  • Type

    conf

  • DOI
    10.1109/CLUSTER.2012.62
  • Filename
    6337854