Title :
Metrics for Evaluating Energy Saving Techniques for Resilient HPC Systems
Author :
Grant, Ryan E. ; Olivier, Stephen L. ; Laros, James H. ; Brightwell, Ron ; Porterfield, Allan K.
Author_Institution :
Sandia Nat. Labs., Albuquerque, NM, USA
Abstract :
The metrics used for evaluating energy saving techniques for future HPC systems are critical to the correct assessment of proposed methods. Current predictions forecast that overcoming reduced system reliability, increased power requirements, and increased energy consumption will be a major design challenge for future systems. Modern runtime energy-saving research efforts do not take into account the energy spent providing reliability. They also do not account for the increase in the probability of failure during application execution due to runtime overhead from energy saving methods. While this is reasonable for current systems, it is insufficient for future-generation systems. By taking into account the energy consumption ramifications of increased runtimes on system reliability, better energy saving techniques can be developed. This paper demonstrates how to determine the impact of runtime energy conservation methods within the context of failure-prone large-scale systems. In addition, several energy saving methodologies are surveyed and analyzed with respect to their effectiveness in an environment in which failures occur.
Keywords :
parallel processing; power aware computing; probability; application execution; energy consumption ramifications; energy saving methods; energy saving techniques; failure probability; failure-prone large scale systems; metrics; power requirements; resilient HPC systems; runtime energy conservation methods; runtime energy-saving research efforts; system reliability; Checkpointing; Energy consumption; Equations; Measurement; Reliability; Runtime; Sockets; DVFS; HPC; energy saving; frequency scaling; power; reliability; voltage scaling
Conference_Title :
Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International
Conference_Location :
Phoenix, AZ
Print_ISBN :
978-1-4799-4117-9
DOI :
10.1109/IPDPSW.2014.91