• DocumentCode
    2498014
  • Title
    Parametric value function approximation: A unified view
  • Author
    Geist, Matthieu; Pietquin, Olivier
  • Author_Institution
    IMS Res. Group, Supelec, Metz, France
  • fYear
    2011
  • fDate
    11-15 April 2011
  • Firstpage
    9
  • Lastpage
    16
  • Abstract
    Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. An important RL subtopic is approximating this function when the system is too large for an exact representation. This survey reviews and unifies state-of-the-art methods for parametric value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific way of minimizing it, almost always stochastic gradient descent or recursive least-squares.
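    To make the cost-function/minimizer taxonomy concrete, here is a minimal sketch assuming a linear parametrization V_theta(s) = theta^T phi(s); the feature map phi, the (state, reward, next_state) sample format, and all names below are illustrative, not taken from the paper. It pairs one bootstrapping cost with the two minimization schemes the abstract names: stochastic gradient descent (semi-gradient TD(0)) and recursive least-squares.

        import numpy as np

        def td0_sgd(phi, transitions, dim, gamma=0.95, alpha=0.01):
            # Bootstrapping cost + stochastic gradient descent (semi-gradient TD(0)).
            theta = np.zeros(dim)
            for s, r, s_next in transitions:
                # TD error: bootstrapped target minus current estimate.
                delta = r + gamma * phi(s_next) @ theta - phi(s) @ theta
                # Gradient step on the squared TD error, holding the
                # bootstrapped target fixed (hence "semi-gradient").
                theta += alpha * delta * phi(s)
            return theta

        def td0_rls(phi, transitions, dim, gamma=0.95, eps=1e-2):
            # Same bootstrapping cost, minimized by recursive least-squares.
            theta = np.zeros(dim)
            P = np.eye(dim) / eps  # running estimate of the inverse covariance
            for s, r, s_next in transitions:
                f = phi(s)
                delta = r + gamma * phi(s_next) @ theta - f @ theta
                K = P @ f / (1.0 + f @ P @ f)  # RLS gain (Sherman-Morrison rank-1 update)
                theta += K * delta
                P -= np.outer(K, f @ P)
            return theta

    Both routines target the same cost; the recursive least-squares variant pays O(dim^2) per sample in exchange for a step-size-free update, which is the SGD-versus-RLS trade-off the abstract alludes to.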
  • Keywords
    function approximation; gradient methods; learning (artificial intelligence); least squares approximations; bootstrapping approach; machine learning; optimal control policy; parametric value function approximation; projected fixed-point approach; recursive least-squares approach; reinforcement learning; residuals approach; stochastic gradient descent approach; approximation algorithms; cost function; equations; mathematical model; prediction algorithms; stochastic processes; survey; value function approximation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
  • Conference_Location
    Paris, France
  • Print_ISBN
    978-1-4244-9887-1
  • Type
    conf
  • DOI
    10.1109/ADPRL.2011.5967355
  • Filename
    5967355