Regional Information Center for Science and Technology - A Scalable Model-Free Recurrent Neural Network Framework for Solving POMDPs

DocumentCode :

2717337

Title :

A Scalable Model-Free Recurrent Neural Network Framework for Solving POMDPs

Author :

Liu, Zhenzhen ; Elhanany, Itamar

Author_Institution :

Dept. of Electr. & Comput. Eng., Tennessee Univ., Knoxville, TN

fYear :

2007

fDate :

1-5 April 2007

Firstpage :

119

Lastpage :

126

Abstract :

This paper presents a framework for obtaining an optimal policy in model-free partially observable Markov decision problems (POMDPs) using a recurrent neural network (RNN). A Q-function approximation approach is taken, utilizing a novel RNN architecture with computation and storage requirements that are dramatically reduced when compared to existing schemes. A scalable online training algorithm, derived from the real-time recurrent learning (RTRL) algorithm, is employed. Moreover, stochastic meta-descent (SMD), an adaptive step size scheme for stochastic gradient-descent problems, is utilized as a means of incorporating curvature information to accelerate the learning process. We consider case studies of POMDPs where state information is not directly available to the agent. In particular, we investigate scenarios in which the agent receives identical observations for multiple states, thereby relying on temporal dependencies captured by the RNN to obtain the optimal policy. Simulation results illustrate the effectiveness of the approach, along with a substantial improvement in convergence rate when compared to existing schemes.

Keywords :

Markov processes; decision theory; learning (artificial intelligence); recurrent neural nets; Q-function approximation; constraint optimization; partially observable Markov decision problems; real-time recurrent learning (RTRL); recurrent neural network; stochastic gradient-descent problems; stochastic meta-descent; Acceleration; Computational complexity; Computer architecture; Computer networks; Dynamic programming; Learning; Neurons; Nonlinear dynamical systems; Recurrent neural networks; Stochastic processes

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on

Conference_Location :

Honolulu, HI

Print_ISBN :

1-4244-0706-0

Type :

conf

DOI :

10.1109/ADPRL.2007.368178

Filename :

4220823

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2717337