• DocumentCode
    2423163
  • Title
    Tutor learning using linear constraints in approximate dynamic programming
  • Author
    Di Castro, Dotan ; Mannor, Shie
  • Author_Institution
    Fac. of Electr. Eng., Technion - Israel Inst. of Technol., Haifa, Israel
  • fYear
    2010
  • fDate
    Sept. 29 2010-Oct. 1 2010
  • Firstpage
    1384
  • Lastpage
    1390
  • Abstract
    In adaptive control, agents interacting with Markov Decision Processes typically face two types of setups. In the first setup, the environment's model is known and dynamic programming and related methods are used to obtain the optimal control. In the second setup, the environment's model is unknown and reinforcement learning methods are used. In this work we investigate a new setup that is a mix of these two setups: only part of the environment's model is known and additional information regarding the environment is provided by a tutor. We formalize this problem using linear function approximation in order to overcome the “curse of dimensionality” phenomenon. In addition, using the Envelope Theorem, we show how one can tune the approximation basis in order to get locally optimal results. Finally, the suggested methods are demonstrated in simulations.
  • Keywords
    Markov processes; adaptive control; approximation theory; dynamic programming; learning (artificial intelligence); optimal control; Markov decision process; envelope theorem; linear constraint; linear function approximation; reinforcement learning; tutor learning; Approximation algorithms; Function approximation; Mathematical model; Optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communication, Control, and Computing (Allerton), 2010 48th Annual Allerton Conference on
  • Conference_Location
    Allerton, IL
  • Print_ISBN
    978-1-4244-8215-3
  • Type
    conf
  • DOI
    10.1109/ALLERTON.2010.5707075
  • Filename
    5707075