Distributed reinforcement learning for power limited many-core system performance optimization

Author

Chen, Zhuo; Marculescu, Diana

fYear

2015

fDate

9-13 March 2015

Firstpage

1521

Lastpage

1526

Abstract

As power density emerges as the main constraint for many-core systems, keeping power consumption under the Thermal Design Power (TDP) while maximizing performance becomes increasingly critical. Dynamic Voltage Frequency Scaling (DVFS) techniques have proved effective at saving power dynamically and are widely available commercially. In this paper, we present an On-line Distributed Reinforcement Learning (OD-RL) based DVFS control algorithm for many-core system performance improvement under power constraints. At the finer grain, a per-core Reinforcement Learning (RL) method learns the optimal control policy for the Voltage/Frequency (VF) levels in a model-free manner, that is, without requiring a model of the system. At the coarser grain, an efficient global power budget reallocation algorithm maximizes the overall performance. The experiments show that, compared to state-of-the-art algorithms, OD-RL 1) produces up to 98% less budget overshoot, 2) achieves up to 44.3x better throughput per over-the-budget energy and up to 23% higher energy efficiency, and 3) delivers a two-orders-of-magnitude speedup over state-of-the-art techniques for systems with hundreds of cores.
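
The two-level structure described in the abstract (per-core learning of VF levels, plus a global budget reallocation step) can be illustrated with a minimal sketch. The abstract does not specify the state encoding, the reward function, or the reallocation rule, so everything below is an assumption made for illustration: the tabular Q-learning agent, the `reward` shape (throughput minus a penalty for exceeding the budget), and the proportional `reallocate` heuristic are hypothetical stand-ins, not the paper's actual OD-RL algorithm.

```python
import random


class CoreAgent:
    """Hypothetical per-core tabular Q-learning agent; each action is a VF level."""

    def __init__(self, n_states, n_levels, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        # Q-table: one row per (assumed) discretized state, one column per VF level.
        self.q = [[0.0] * n_levels for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy choice of a VF level.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[state]))
        row = self.q[state]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward_value, next_state):
        # Standard one-step Q-learning update; no system model is used.
        target = reward_value + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def reward(throughput, power, budget, penalty=10.0):
    # Assumed reward shape: throughput, penalized for any power above the core's budget.
    return throughput - penalty * max(0.0, power - budget)


def reallocate(powers, total_budget):
    # Assumed heuristic: split the chip-level budget in proportion to recent per-core power.
    total_power = sum(powers)
    if total_power <= 0:
        return [total_budget / len(powers)] * len(powers)
    return [total_budget * p / total_power for p in powers]
```

In this sketch each core would call `act` and `update` once per control epoch using only its own measurements, while `reallocate` would run less often at the chip level; that split mirrors the finer-grain and coarser-grain roles named in the abstract, but the specific formulas are illustrative only.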

Keywords

electronic engineering computing; learning (artificial intelligence); multiprocessing systems; optimal control; performance evaluation; power aware computing; resource allocation; DVFS techniques; OD-RL based DVFS control algorithm; TDP; distributed reinforcement learning; dynamic voltage frequency scaling techniques; energy efficiency; global power budget reallocation algorithm; many-core system performance improvement; online distributed reinforcement learning based DVFS control algorithm; optimal control policy; per-core reinforcement learning method; power consumption; power limited many-core system performance optimization; thermal design power; Algorithm design and analysis; Complexity theory; Learning (artificial intelligence); Multicore processing; Power demand; Scalability; Throughput;

fLanguage

English

Publisher

IEEE

Conference_Titel

Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015

Conference_Location

Grenoble

Print_ISBN

978-3-9815-3704-8

Type

conf

Filename

7092630