Regional Information Center for Science and Technology - Goal Representation Heuristic Dynamic Programming on Maze Navigation

DocumentCode :

52230

Title :

Goal Representation Heuristic Dynamic Programming on Maze Navigation

Author :

Zhen Ni ; Haibo He ; Jinyu Wen ; Xin Xu

Author_Institution :

Dept. of Electr., Comput. & Biomed. Eng., Univ. of Rhode Island, Kingston, RI, USA

Volume :

Issue :

fYear :

2013

fDate :

Dec. 2013

Firstpage :

2038

Lastpage :

2050

Abstract :

Goal representation heuristic dynamic programming (GrHDP) is proposed in this paper to demonstrate online learning in Markov decision processes. In addition to the (external) reinforcement signal used in the literature, we develop an adaptive internal goal/reward representation for the agent through the proposed goal network. Specifically, we keep the actor-critic design of heuristic dynamic programming (HDP) and add a goal network that represents the internal goal signal, which further aids value function approximation. We evaluate the proposed GrHDP algorithm on two 2-D maze navigation problems and then on one 3-D maze navigation problem. Compared with the traditional HDP approach, the agent's learning performance improves with GrHDP. We also report the learning performance of two other reinforcement learning algorithms, Sarsa(λ) and Q-learning, on the same benchmarks for comparison. Furthermore, to support the method theoretically, we analyze the convergence of the neural network weights in the GrHDP approach.
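The abstract names Q-learning as one of the comparison baselines on maze navigation. For orientation only, a minimal tabular Q-learning sketch on a small 2-D grid maze might look like the following. The maze layout, the reward scheme (-1 per step, 0 on reaching the goal), the helper names, and all hyperparameters are assumptions made for illustration and are not taken from the paper. This is not the GrHDP method itself, which keeps an actor-critic design and adds a goal network.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(symbol):
    """Return the (row, col) of the first cell holding `symbol`."""
    for r, row in enumerate(MAZE):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(symbol)


def step(state, action):
    """Move one cell; hitting a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        r, c = state
    done = MAZE[r][c] == "G"
    reward = 0.0 if done else -1.0  # assumed external reinforcement signal
    return (r, c), reward, done


def greedy_action(q, state):
    """Index of the highest-valued action; unseen entries default to 0.0."""
    return max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))


def q_learning(episodes=1000, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    start = find("S")
    q = {}  # (state, action_index) -> estimated return
    for _ in range(episodes):
        state = start
        for _ in range(200):  # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = greedy_action(q, state)
            nxt, reward, done = step(state, ACTIONS[a])
            best_next = 0.0 if done else q.get((nxt, greedy_action(q, nxt)), 0.0)
            old = q.get((state, a), 0.0)
            # Off-policy temporal-difference update toward r + gamma * max_a' Q(s', a').
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from the start cell."""
    state = find("S")
    path = [state]
    for _ in range(max_steps):
        state, _, done = step(state, ACTIONS[greedy_action(q, state)])
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(q_learning())` returns the sequence of cells visited under the learned policy. On this assumed layout the shortest route from start to goal takes six moves.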

Keywords :

Markov processes; approximation theory; learning (artificial intelligence); navigation; neural nets; 2D maze navigation; 3D maze navigation; GrHDP; Markov decision process; Q-learning; Sarsa(λ); actor-critic design; goal representation heuristic dynamic programming; neural networks; online learning; reinforcement learning; value function approximation; Benchmark testing; Convergence; Dynamic programming; Equations; Mathematical model; Navigation; Neural networks; Adaptive dynamic programming; maze navigation/path planning

fLanguage :

English

Journal_Title :

Neural Networks and Learning Systems, IEEE Transactions on

Publisher :

IEEE

ISSN :

2162-237X

Type :

jour

DOI :

10.1109/TNNLS.2013.2271454

Filename :

6565386

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=52230