DocumentCode :
2495422
Title :
Region enhanced neural Q-learning for solving model-based POMDPs
Author :
Wiering, Marco A. ; Kooi, Thijs
Author_Institution :
Dept. of Artificial Intell., Univ. of Groningen, Groningen, Netherlands
fYear :
2010
fDate :
18-23 July 2010
Firstpage :
1
Lastpage :
8
Abstract :
To perform tasks autonomously, a robot has to plan its behavior and make decisions based on the input it receives. Unfortunately, contemporary robot sensors and actuators are subject to noise, which turns optimal decision making into a stochastic process. Such processes can be modeled with partially observable Markov decision processes (POMDPs). In this paper we introduce RENQ, a new POMDP algorithm that combines neural networks for estimating Q-values with the construction of a spatial pyramid over the state space. RENQ essentially uses region-based belief vectors together with state-based belief vectors, and both serve as inputs to the neural network trained with Q-learning. We compare RENQ to Qmdp and Perseus, two state-of-the-art algorithms for approximately solving model-based POMDPs. The results on three different maze navigation tasks indicate that RENQ outperforms Perseus on all problems and outperforms Qmdp as the problem size grows.
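A minimal sketch (not the authors' implementation) of the idea the abstract describes: pool a POMDP state belief over a spatial pyramid of regions, concatenate the region-based beliefs with the state-based belief, and train a small Q-network on these features with Q-learning. The grid size, pyramid levels, network shape, and all names below are illustrative assumptions.

import numpy as np

GRID = 8               # assumed 8x8 maze, one state per cell
N_STATES = GRID * GRID
N_ACTIONS = 4          # assumed: up, down, left, right
LEVELS = [4, 2]        # assumed pyramid levels: 4x4 and 2x2 region grids

def pyramid_features(belief):
    """Concatenate the state belief with region beliefs at each pyramid level."""
    b = belief.reshape(GRID, GRID)
    feats = [belief]
    for side in LEVELS:
        cell = GRID // side
        # sum the belief mass inside each (cell x cell) region
        region = b.reshape(side, cell, side, cell).sum(axis=(1, 3))
        feats.append(region.ravel())
    return np.concatenate(feats)

N_IN = N_STATES + sum(s * s for s in LEVELS)
N_HID = 32             # assumed hidden-layer size

rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, (N_HID, N_IN))
W2 = rng.normal(0.0, 0.1, (N_ACTIONS, N_HID))

def q_values(x):
    h = np.tanh(W1 @ x)        # hidden activations
    return W2 @ h, h           # Q-value per action, plus h for the update

def q_update(x, a, r, x_next, done, alpha=0.01, gamma=0.95):
    """One semi-gradient Q-learning step on belief features."""
    global W1, W2
    q, h = q_values(x)
    q_next, _ = q_values(x_next)
    target = r if done else r + gamma * q_next.max()
    delta = target - q[a]                 # TD error for the taken action
    grad_h = W2[a] * (1.0 - h ** 2)       # backprop through tanh (before W2 changes)
    W2[a] += alpha * delta * h
    W1 += alpha * delta * np.outer(grad_h, x)

# Illustrative usage with random beliefs standing in for a real POMDP belief
# update after an action/observation pair:
b = rng.random(N_STATES); b /= b.sum()
b2 = rng.random(N_STATES); b2 /= b2.sum()
q_update(pyramid_features(b), a=0, r=-1.0, x_next=pyramid_features(b2), done=False)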
Keywords :
Markov processes; decision making; learning (artificial intelligence); neural nets; path planning; robots; actuators; autonomous robot; maze navigation task; model-based POMDP; neural Q-learning; partially observable Markov decision processes; rendering optimal decision making; robot sensors; stochastic process; Navigation; Robot sensing systems; Variable speed drives;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
The 2010 International Joint Conference on Neural Networks (IJCNN)
Conference_Location :
Barcelona
ISSN :
1098-7576
Print_ISBN :
978-1-4244-6916-1
Type :
conf
DOI :
10.1109/IJCNN.2010.5596811
Filename :
5596811