DocumentCode
2423163
Title
Tutor learning using linear constraints in approximate dynamic programming
Author
Di Castro, Dotan ; Mannor, Shie
Author_Institution
Fac. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
fYear
2010
fDate
Sept. 29 2010-Oct. 1 2010
Firstpage
1384
Lastpage
1390
Abstract
In adaptive control, agents interacting with Markov Decision Processes typically face two types of setups. In the first setup, the environment's model is known and dynamic programming and related methods are used to obtain the optimal control. In the second setup, the environment's model is unknown and reinforcement learning methods are used. In this work we investigate a new setup that is a mix of these two setups: only part of the environment's model is known and additional information regarding the environment is provided by a tutor. We formalize this problem using linear function approximation in order to overcome the "curse of dimensionality" phenomenon. In addition, using the Envelope Theorem, we show how one can tune the approximation basis in order to get locally optimal results. Finally, the suggested methods are demonstrated in simulations.
Keywords
Markov processes; adaptive control; approximation theory; dynamic programming; learning (artificial intelligence); optimal control; Markov decision process; adaptive control; dynamic programming; envelope theorem; linear constraint; linear function approximation; optimal control; reinforcement learning; tutor learning; Approximation algorithms; Dynamic programming; Function approximation; Markov processes; Mathematical model; Optimization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
Conference_Location
Allerton, IL
Print_ISBN
978-1-4244-8215-3
Type
conf
DOI
10.1109/ALLERTON.2010.5707075
Filename
5707075