Title :
Optimizing power allocation to CPU and memory subsystems in overprovisioned HPC systems
Author :
Sarood, Osman ; Langer, Akhil ; Kale, Laxmikant ; Rountree, Barry ; de Supinski, Bronis
Author_Institution :
Dept. of Comput. Sci., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
Abstract :
Energy consumption and power draw pose two major challenges to the HPC community for designing larger systems. Present-day HPC systems consume as much as 10 MW of electricity, and this is fast becoming a bottleneck. Although energy bills will significantly increase with machine size, power consumption is a hard constraint that must be addressed. Intel's Running Average Power Limit (RAPL) toolkit is a recent feature that enables power capping of CPU and memory subsystems on modern hardware. In this paper, we use RAPL to evaluate the possibility of improving the execution time efficiency of an application by capping power while adding more nodes. We profile the strong scaling of an application using different power caps for both CPU and memory subsystems. Our proposed interpolation scheme uses an application profile to optimize the number of nodes and the distribution of power between CPU and memory subsystems to minimize execution time under a strict power budget. We validate these estimates by running experiments on a 20-node (120 cores) Sandy Bridge cluster. Our experimental results closely match the model estimates and show speedups greater than 1.47X for all applications compared to not capping CPU and memory power. We demonstrate that the quality of the solution that our interpolation scheme provides closely matches results obtained via exhaustive profiling.
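The optimization described in the abstract (choose a node count and a CPU/memory power split that minimize execution time under a total power budget) can be illustrated with a minimal sketch. This is not the authors' interpolation scheme: it is a plain exhaustive search, and the time model `toy_time`, its constants, and the function `best_config` are hypothetical stand-ins for a measured application profile.

```python
from itertools import product


def toy_time(nodes, cpu_w, mem_w):
    """Hypothetical execution-time model in seconds (not from the paper)."""
    cpu_part = 600.0 / cpu_w  # CPU-bound share shrinks as the CPU cap grows
    mem_part = 200.0 / mem_w  # memory-bound share shrinks as the memory cap grows
    # Assumed ideal strong scaling: time is inversely proportional to node count.
    return (cpu_part + mem_part) * 100.0 / nodes


def best_config(budget_w, max_nodes, cpu_caps, mem_caps, time_fn=toy_time):
    """Return (time, nodes, cpu_w, mem_w) with minimal time, or None if nothing fits."""
    best = None
    for nodes, cpu_w, mem_w in product(range(1, max_nodes + 1), cpu_caps, mem_caps):
        if nodes * (cpu_w + mem_w) > budget_w:
            continue  # this configuration exceeds the total power budget
        candidate = (time_fn(nodes, cpu_w, mem_w), nodes, cpu_w, mem_w)
        if best is None or candidate < best:
            best = candidate
    return best


if __name__ == "__main__":
    # Assumed values: 1000 W budget, up to 20 nodes, per-node caps in watts.
    print(best_config(1000, 20, [30, 40, 50], [10, 20]))
```

Under these assumed numbers, the search prefers more nodes at a lower CPU cap over fewer nodes at full power, which is the overprovisioning trade-off the abstract evaluates. In practice `time_fn` would be replaced by values measured or interpolated from an application profile.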
Keywords :
interpolation; parallel processing; power aware computing; CPU; HPC community; Intel; RAPL toolkit; Sandy Bridge cluster; application profile; electricity; energy bills; energy consumption; execution time efficiency; interpolation scheme; memory subsystems; overprovisioned HPC systems; power 10 MW; power allocation optimization; power capping; power consumption; power draw; running average power limit toolkit; Energy consumption; Equations; Interpolation; Mathematical model; Memory management; Power demand;
Conference_Title :
Cluster Computing (CLUSTER), 2013 IEEE International Conference on
Conference_Location :
Indianapolis, IN
DOI :
10.1109/CLUSTER.2013.6702684