DocumentCode :
2423163
Title :
Tutor learning using linear constraints in approximate dynamic programming
Author :
Di Castro, Dotan ; Mannor, Shie
Author_Institution :
Fac. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
fYear :
2010
fDate :
Sept. 29 2010-Oct. 1 2010
Firstpage :
1384
Lastpage :
1390
Abstract :
In adaptive control, agents interacting with Markov Decision Processes typically face two types of setups. In the first setup, the environment's model is known, and dynamic programming and related methods are used to obtain the optimal control. In the second setup, the environment's model is unknown and reinforcement learning methods are used. In this work we investigate a new setup that mixes the two: only part of the environment's model is known, and additional information regarding the environment is provided by a tutor. We formalize this problem using linear function approximation in order to overcome the “curse of dimensionality” phenomenon. In addition, using the Envelope Theorem, we show how one can tune the approximation basis in order to get locally optimal results. Finally, the suggested methods are demonstrated in simulations.
Keywords :
Markov processes; adaptive control; approximation theory; dynamic programming; learning (artificial intelligence); optimal control; Markov decision process; adaptive control; dynamic programming; envelope theorem; linear constraint; linear function approximation; optimal control; reinforcement learning; tutor learning; Approximation algorithms; Dynamic programming; Function approximation; Markov processes; Mathematical model; Optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
Conference_Location :
Allerton, IL
Print_ISBN :
978-1-4244-8215-3
Type :
conf
DOI :
10.1109/ALLERTON.2010.5707075
Filename :
5707075