DocumentCode :
2171332
Title :
Convergence of Q-learning with linear function approximation
Author :
Melo, Francisco S.; Ribeiro, M. Isabel
Author_Institution :
Institute for Systems and Robotics, Instituto Superior Técnico, Lisbon, Portugal
fYear :
2007
fDate :
2-5 July 2007
Firstpage :
2671
Lastpage :
2678
Abstract :
In this paper, we analyze the convergence properties of Q-learning with linear function approximation. The algorithm can be seen as an extension of TD-learning with linear function approximation, as described in [1], to stochastic control settings. We derive a set of conditions under which this approximation method converges with probability 1 when a fixed learning policy is used, and we interpret the resulting approximation as the fixed point of a Bellman-like operator. We then discuss how our result relates to several earlier works, as well as its general applicability.
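As a concrete illustration of the setting the abstract describes, the sketch below runs Q-learning with linear function approximation on a small randomly generated MDP under a fixed (uniform) learning policy. It is a minimal sketch, not the authors' exact construction: the feature map phi, the MDP dynamics, and the step-size schedule are all illustrative assumptions. The update it implements is theta <- theta + alpha_t * delta_t * phi(x_t, a_t), with TD error delta_t = r_t + gamma * max_b theta^T phi(y_t, b) - theta^T phi(x_t, a_t); the max over next actions is what distinguishes Q-learning from TD-learning.

    # Minimal sketch of Q-learning with linear function approximation.
    # The environment, features, and step sizes are illustrative
    # placeholders, not the paper's exact construction.
    import numpy as np

    rng = np.random.default_rng(0)

    n_states, n_actions, n_features = 5, 2, 4
    gamma = 0.9

    # Hypothetical fixed feature map phi(x, a) in R^n_features.
    phi = rng.normal(size=(n_states, n_actions, n_features))

    # Hypothetical MDP: P[x, a] is a distribution over next states,
    # R[x, a] is the expected reward.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.normal(size=(n_states, n_actions))

    theta = np.zeros(n_features)
    x = 0
    for t in range(1, 50_000):
        a = rng.integers(n_actions)          # fixed (uniform) learning policy
        y = rng.choice(n_states, p=P[x, a])  # sample next state
        r = R[x, a]
        # TD error with a max over next actions (the Q-learning step).
        q_next = max(phi[y, b] @ theta for b in range(n_actions))
        delta = r + gamma * q_next - phi[x, a] @ theta
        alpha = 1.0 / t                      # Robbins-Monro step sizes
        theta += alpha * delta * phi[x, a]   # stochastic-approximation update
        x = y

    print("learned weights:", theta)

Under conditions of the kind the paper derives (suitable step sizes and sufficient exploration under the fixed learning policy), theta converges with probability 1, and the limit can be read as the fixed point of a Bellman-like operator.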
Keywords :
function approximation; learning (artificial intelligence); probability; stochastic systems; Bellman-like operator; Q-learning convergence; TD-learning; fixed learning policy; linear function approximation; stochastic control settings; approximation algorithms; convergence; process control; robots
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2007 European Control Conference (ECC)
Conference_Location :
Kos, Greece
Print_ISBN :
978-3-9524173-8-6
Type :
conf
Filename :
7068926