  • DocumentCode
    2498672
  • Title
    Moving least-squares approximations for linearly-solvable MDP
  • Author
    Zhong, Mingyuan ; Todorov, Emanuel
  • Author_Institution
    Dept. of Appl. Math., Univ. of Washington, Seattle, WA, USA
  • fYear
    2011
  • fDate
    11-15 April 2011
  • Firstpage
    218
  • Lastpage
    225
  • Abstract
    By introducing the linearly-solvable Markov decision process (LMDP), a general class of nonlinear stochastic optimal control problems can be reduced to solving linear problems. In practice, however, LMDPs defined on continuous state spaces remain difficult to solve because of the high dimensionality of the state space. Here we describe a new framework for finding the solution using a moving least-squares approximation. We use efficient iterative solvers that do not require matrix factorization, so we can handle large numbers of bases. The basis functions are constructed from collocation states that change over iterations of the algorithm, so as to provide higher resolution in the regions of state space that are visited more often. The shape of the bases is defined automatically given the collocation states, in a way that avoids gaps in coverage and avoids fitting a very large number of parameters. Numerical results on test problems are provided and demonstrate good behavior when scaled to high-dimensional problems.
  • Keywords
    Markov processes; iterative methods; least squares approximations; nonlinear control systems; optimal control; stochastic systems; LMDP; Markov decision process; iterative solver; linearly-solvable MDP; moving least-squares approximation; nonlinear stochastic optimal control problem; Eigenvalues and eigenfunctions; Equations; Least squares approximation; Markov processes; Optimal control; Shape
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on
  • Conference_Location
    Paris
  • Print_ISBN
    978-1-4244-9887-1
  • Type
    conf
  • DOI
    10.1109/ADPRL.2011.5967383
  • Filename
    5967383