Title :
A near-optimal reinforcement learning scheme for energy efficient point-to-point wireless communications
Author :
Pandana, Charles ; Liu, K. J Ray
Author_Institution :
Dept. of Electr. & Comput. Eng., Maryland Univ., College Park, MD, USA
Date :
29 Nov.-3 Dec. 2004
Abstract :
We consider the problem of maximizing average throughput per total consumed energy in packetized point-to-point wireless sensor communications. Our study yields an optimal transmission strategy that chooses the modulation level and transmit power while adapting to the incoming traffic rate, buffer condition, and channel condition. We formulate the optimization problem as a Markov decision process (MDP). When the state transition probabilities of the MDP are available, the optimal policy can be obtained using dynamic programming (DP). Since these transition probabilities may not be available in practical situations when the optimization is performed, we propose to learn a near-optimal policy through a reinforcement learning (RL) algorithm. We show that the RL algorithm learns a policy that achieves almost the same throughput as the optimal one, and that the learned policy obtains more than twice the average throughput of a simple constant signal-to-noise ratio (CSNR) policy, particularly at high packet arrival rates. Moreover, the learning algorithm is robust in tracking variations of the governing probabilities.
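The abstract describes learning a near-optimal transmission policy (modulation level and transmit power as a function of buffer and channel state) when the MDP's transition probabilities are unknown. Below is a minimal sketch of how such a policy might be learned with tabular Q-learning, a standard model-free RL algorithm. The state/action discretization, the toy success-probability model, and all names here are illustrative assumptions, not the paper's actual formulation:

```python
import random

# Hypothetical discretization (illustrative, not from the paper)
BUFFER_LEVELS = 4                # queue-occupancy bins
CHANNEL_STATES = 3               # quantized channel-gain bins
MOD_LEVELS = [1, 2, 4]           # bits per symbol (BPSK/QPSK/16-QAM style)
POWER_LEVELS = [1.0, 2.0, 4.0]   # transmit power, arbitrary units

ACTIONS = [(m, p) for m in MOD_LEVELS for p in POWER_LEVELS]
STATES = [(b, c) for b in range(BUFFER_LEVELS) for c in range(CHANNEL_STATES)]


def step(state, action, rng):
    """Toy environment: reward is throughput per unit energy.

    The learner never sees the transition probabilities, matching the
    paper's motivation for RL over dynamic programming.
    """
    buf, ch = state
    mod, power = action
    # Success probability improves with power and channel quality,
    # degrades with higher modulation order (made-up model).
    p_success = min(1.0, 0.3 * power * (ch + 1) / mod)
    served = mod if (buf > 0 and rng.random() < p_success) else 0
    reward = served / power                    # bits delivered per unit energy
    arrivals = rng.choice([0, 1])              # random packet arrivals
    buf = min(BUFFER_LEVELS - 1, max(0, buf - (served > 0)) + arrivals)
    ch = rng.randrange(CHANNEL_STATES)         # i.i.d. channel for simplicity
    return (buf, ch), reward


def q_learning(steps=5000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    """Learn a (buffer, channel) -> (modulation, power) policy online."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = (0, 0)
    for _ in range(steps):
        # epsilon-greedy exploration
        if rng.random() < eps:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: Q[(state, a)])
        next_state, reward = step(state, action, rng)
        best_next = max(Q[(next_state, a)] for a in ACTIONS)
        Q[(state, action)] += alpha * (reward + gamma * best_next
                                       - Q[(state, action)])
        state = next_state
    # Greedy policy: best action per state under the learned Q-values
    return {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}


policy = q_learning()
```

Because the update uses only sampled transitions, the same loop keeps adapting if the arrival or channel statistics drift, which is the robustness property the abstract claims for the learned policy.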
Keywords :
Markov processes; adaptive control; buffer storage; channel estimation; dynamic programming; learning (artificial intelligence); packet radio networks; power control; probability; queueing theory; telecommunication control; telecommunication traffic; wireless sensor networks; Markov decision process; average throughput maximization; buffer condition; channel condition; dynamic programming; energy efficient wireless communications; incoming traffic rate; near-optimal policy; near-optimal reinforcement learning; optimal modulation level; optimal transmission strategy; optimization problem; packet arrival rate; packetized wireless sensor communications; point-to-point wireless communications; state transition probability; total consumed energy; transmit power; Dynamic programming; Energy efficiency; Learning; Optimal control; Power control; Resource management; Signal to noise ratio; Throughput; Wireless communication; Wireless sensor networks;
Conference_Title :
Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE
Print_ISBN :
0-7803-8794-5
DOI :
10.1109/GLOCOM.2004.1378063