DocumentCode :
2717337
Title :
A Scalable Model-Free Recurrent Neural Network Framework for Solving POMDPs
Author :
Liu, Zhenzhen ; Elhanany, Itamar
Author_Institution :
Dept. of Electr. & Comput. Eng., Tennessee Univ., Knoxville, TN
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
119
Lastpage :
126
Abstract :
This paper presents a framework for obtaining an optimal policy in model-free partially observable Markov decision problems (POMDPs) using a recurrent neural network (RNN). A Q-function approximation approach is taken, utilizing a novel RNN architecture with computation and storage requirements that are dramatically reduced when compared to existing schemes. A scalable online training algorithm, derived from the real-time recurrent learning (RTRL) algorithm, is employed. Moreover, stochastic meta-descent (SMD), an adaptive step size scheme for stochastic gradient-descent problems, is utilized as a means of incorporating curvature information to accelerate the learning process. We consider case studies of POMDPs where state information is not directly available to the agent. In particular, we investigate scenarios in which the agent receives identical observations for multiple states, thereby relying on temporal dependencies captured by the RNN to obtain the optimal policy. Simulation results illustrate the effectiveness of the approach along with substantial improvement in convergence rate when compared to existing schemes.
Keywords :
Markov processes; decision theory; learning (artificial intelligence); recurrent neural nets; Q-function approximation; constraint optimization; partially observable Markov decision problems; real-time recurrent learning (RTRL); recurrent neural network; stochastic gradient-descent problems; stochastic meta-descent; Acceleration; Computational complexity; Computer architecture; Computer networks; Dynamic programming; Learning; Neurons; Nonlinear dynamical systems; Recurrent neural networks; Stochastic processes
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
Type :
conf
DOI :
10.1109/ADPRL.2007.368178
Filename :
4220823