DocumentCode :
838109
Title :
Kernel-based reinforcement learning in average-cost problems
Author :
Ormoneit, Dirk ; Glynn, Peter
Author_Institution :
Marshall Wace Asset Management, London, UK
Volume :
47
Issue :
10
fYear :
2002
fDate :
1 October 2002
Firstpage :
1624
Lastpage :
1636
Abstract :
Reinforcement learning (RL) is concerned with identifying optimal controls in Markov decision processes (MDPs) for which no explicit model of the transition probabilities is available. We propose a class of RL algorithms which always produce stable estimates of the value function. Specifically, we use "local averaging" methods to construct an approximate dynamic programming (ADP) algorithm; nearest-neighbor regression, grid-based approximations, and trees can all serve as the basis of this approximation. We provide a thorough theoretical analysis of this approach and demonstrate that ADP converges to a unique approximation in continuous-state average-cost MDPs. In addition, we prove that our method is consistent in the sense that an optimal approximate strategy is identified asymptotically. With regard to practical implementation, we suggest reducing ADP to standard dynamic programming in an artificial finite-state MDP.
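The abstract describes approximating the Bellman operator by local averaging over sample transitions and then reducing ADP to dynamic programming on an artificial finite-state MDP. The sketch below is not the authors' implementation; it is a minimal illustration of that idea, assuming Gaussian kernel weights, a per-action sample layout (X, Xnext, C), and relative value iteration for the average-cost criterion. The function names, bandwidth, and data layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kernel_weights(x, centers, bandwidth):
    """Normalized Gaussian kernel weights of a query state x against sample states."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return w / max(w.sum(), 1e-12)

def kernel_avg_cost_vi(data, bandwidth=0.3, n_iter=300):
    """Relative value iteration on the finite-state MDP induced by the samples.

    data maps each action a to (X, Xnext, C): sampled states, successor states,
    and one-step costs observed under that action.
    """
    actions = list(data.keys())

    # Pool all sampled successor states: they are the states of the artificial
    # finite-state MDP, because the kernel Bellman operator only ever needs the
    # value function at these points.
    slices, blocks, start = {}, [], 0
    for a in actions:
        Xnext = data[a][1]
        slices[a] = slice(start, start + len(Xnext))
        blocks.append(Xnext)
        start += len(Xnext)
    S = np.vstack(blocks)

    # Kernel weights from every pooled state to each action's sampled states.
    W = {a: np.array([kernel_weights(s, data[a][0], bandwidth) for s in S])
         for a in actions}

    h = np.zeros(len(S))   # relative value function on the pooled states
    rho = 0.0              # estimate of the optimal average cost
    for _ in range(n_iter):
        # Local-averaging Bellman operator: the expectation over successors is
        # replaced by a kernel-weighted average of the sampled transitions.
        q = np.stack([W[a] @ (data[a][2] + h[slices[a]]) for a in actions], axis=1)
        Th = q.min(axis=1)
        rho = Th[0]        # normalize against a fixed reference state
        h = Th - rho

    # Greedy policy at the pooled states under the final value estimate.
    q = np.stack([W[a] @ (data[a][2] + h[slices[a]]) for a in actions], axis=1)
    policy = [actions[j] for j in q.argmin(axis=1)]
    return rho, h, policy
```

In this reading, the pooled successor states play the role of the artificial finite-state MDP mentioned in the abstract, and the bandwidth controls the bias-variance trade-off of the local averaging.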
Keywords :
Markov processes; decision theory; dynamic programming; iterative methods; learning (artificial intelligence); probability; Markov decision processes; average-cost problems; grid-based approximations; kernel-based reinforcement learning; local averaging methods; nearest-neighbor regression; optimal controls; trees; unique approximation; Convergence; Dynamic programming; Heuristic algorithms; Kernel; Learning; Neural networks; Optimal control; Stability; Table lookup; Training data;
fLanguage :
English
Journal_Title :
IEEE Transactions on Automatic Control
Publisher :
IEEE
ISSN :
0018-9286
Type :
jour
DOI :
10.1109/TAC.2002.803530
Filename :
1039798