Title :
Natural gradient based reinforcement learning algorithm using active stimulating
Author :
Hao, Chuanchuan ; Fang, Zhou ; Li, Ping
Author_Institution :
Institute of Industrial Process Control, Zhejiang University, Hangzhou, China, 310027
Abstract :
The Episodic Natural Actor-Critic (eNAC) algorithm is an important direct policy search algorithm that, in theory, guarantees an unbiased natural gradient estimate and good learning results. It has one major drawback, however: the system reset assumption. A novel algorithm, active stimulating based eNAC (AS-eNAC), is proposed to relax this restrictive assumption. AS-eNAC extends eNAC by introducing an active stimulating procedure into the interaction process so that informative episodes are generated automatically. Because the initial states of the generated episodes may differ, which violates a prerequisite of eNAC's natural gradient estimation method, a linear approximator of the initial-state value function is employed in the estimation process to improve the accuracy of the estimated natural gradient. Simulation results on cart-pole balancing demonstrate the efficiency of the proposed algorithm.
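The abstract does not give the estimator itself, so the following is only a minimal sketch of how a linear initial-state value approximator could enter an eNAC-style least-squares regression. It assumes the standard eNAC formulation, in which each episode contributes one regression row built from the discounted sum of policy score functions and the discounted return; the function name `enac_natural_gradient`, its arguments, and the small `ridge` term are hypothetical and not taken from the paper.

```python
import numpy as np

def enac_natural_gradient(score_sums, init_features, returns, ridge=1e-8):
    """Sketch of an episodic natural gradient estimate (assumed formulation).

    score_sums:    (E, d) per-episode discounted sums of grad log pi(a_t | s_t)
    init_features: (E, k) features phi(s_0) of each episode's initial state
    returns:       (E,)   per-episode discounted returns
    Returns (w, v): w is the natural gradient estimate, v the weights of the
    linear initial-state value approximator phi(s_0)^T v.
    """
    score_sums = np.asarray(score_sums, dtype=float)
    init_features = np.asarray(init_features, dtype=float)
    returns = np.asarray(returns, dtype=float)
    d = score_sums.shape[1]

    # One regression row per episode: [score sum, initial-state features].
    X = np.hstack([score_sums, init_features])

    # Regularized normal equations; ridge is an assumption added here
    # only to keep the solve well conditioned.
    A = X.T @ X + ridge * np.eye(X.shape[1])
    b = X.T @ returns
    solution = np.linalg.solve(A, b)
    return solution[:d], solution[d:]
```

If `init_features` is a single column of ones, the second block reduces to one constant offset, which corresponds to the usual eNAC setting where every episode starts from the same state. How AS-eNAC chooses its features and generates episodes is not specified in the abstract and is left open here.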
Keywords :
active stimulating; cart-pole balancing; natural gradient estimate; reinforcement learning (RL)
Conference_Title :
Automatic Control and Artificial Intelligence (ACAI 2012), International Conference on
Conference_Location :
Xiamen
Electronic_ISBN :
978-1-84919-537-9
DOI :
10.1049/cp.2012.1236