• DocumentCode
    2432324
  • Title
    Adaptive linear quadratic control using policy iteration
  • Author
    Bradtke, Steven J.; Ydstie, B. Erik; Barto, Andrew G.
  • Author_Institution
    Dept. of Comput. & Inf. Sci., Massachusetts Univ., Amherst, MA, USA
  • Volume
    3
  • fYear
    1994
  • fDate
    29 June-1 July 1994
  • Firstpage
    3475
  • Abstract
    This paper presents stability and convergence results for dynamic programming (DP)-based reinforcement learning applied to linear quadratic regulation (LQR). The algorithm analyzed is based on Q-learning, and it is proven to converge to an optimal controller provided that the underlying system is controllable and a particular signal vector is persistently excited. This is the first convergence result for DP-based reinforcement learning algorithms applied to a continuous problem.
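    The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a scalar plant x' = a*x + b*u with stage cost e*x^2 + f*u^2 and discount factor gamma, and it uses batch least squares for the evaluation step, which is a simplification chosen here. All names and parameter values are hypothetical. Each iteration fits the quadratic Q-function of the current gain k from one noisy trajectory (the exploration noise stands in for the persistent-excitation condition), then improves the gain by minimizing the fitted Q-function over u.

```python
import random


def solve3(m, y):
    """Solve a 3x3 linear system m t = y by Gaussian elimination with pivoting."""
    n = 3
    aug = [row[:] + [y[i]] for i, row in enumerate(m)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    t = [0.0] * n
    for r in reversed(range(n)):
        s = aug[r][n] - sum(aug[r][c] * t[c] for c in range(r + 1, n))
        t[r] = s / aug[r][r]
    return t


def features(x, u):
    # Q(x, u) = h11*x^2 + 2*h12*x*u + h22*u^2 is linear in (h11, h12, h22).
    return [x * x, 2.0 * x * u, u * u]


def evaluate_policy(k, a, b, e, f, gamma, rng, steps=60):
    """Fit the Q-function of the policy u = k*x from one noisy trajectory.

    a and b are used only to simulate the plant; the estimator itself
    sees just states, inputs, and costs.
    """
    m = [[0.0] * 3 for _ in range(3)]
    y = [0.0] * 3
    x = 1.0
    for _ in range(steps):
        u = k * x + rng.uniform(-1.0, 1.0)  # exploration noise
        cost = e * x * x + f * u * u
        x_next = a * x + b * u
        phi = features(x, u)
        phi_next = features(x_next, k * x_next)
        # Bellman equation: theta . (phi - gamma * phi_next) = cost
        psi = [p - gamma * q for p, q in zip(phi, phi_next)]
        for i in range(3):
            y[i] += psi[i] * cost
            for j in range(3):
                m[i][j] += psi[i] * psi[j]
        x = x_next
    return solve3(m, y)  # normal equations of the least-squares fit


def q_policy_iteration(a, b, e, f, gamma, k0, iterations=10, seed=0):
    """Alternate Q-function evaluation and greedy gain improvement."""
    rng = random.Random(seed)
    k = k0  # must be stabilizing to start with
    for _ in range(iterations):
        h11, h12, h22 = evaluate_policy(k, a, b, e, f, gamma, rng)
        k = -h12 / h22  # minimizer of the fitted Q over u
    return k


def riccati_gain(a, b, e, f, gamma, sweeps=2000):
    """Model-based reference gain from the discounted scalar Riccati recursion."""
    p = 0.0
    for _ in range(sweeps):
        p = e + gamma * a * a * p - (gamma * a * b * p) ** 2 / (f + gamma * b * b * p)
    return -gamma * a * b * p / (f + gamma * b * b * p)
```

    For example, with the hypothetical values a = 1.1, b = 1, e = f = 1, gamma = 0.95 and the stabilizing start k0 = -0.6, `q_policy_iteration` can be compared against `riccati_gain`, which uses the model directly.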
  • Keywords
    adaptive control; discrete time systems; dynamic programming; intelligent control; iterative methods; learning (artificial intelligence); linear quadratic control; multivariable systems; stability; Q-learning; adaptive linear quadratic control; convergence; dynamic programming-based reinforcement learning; optimal controller; policy iteration; signal vector; computer science; control systems; cost function; feedback control; optimal control; programmable control; symmetric matrices; vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    American Control Conference, 1994
  • Print_ISBN
    0-7803-1783-1
  • Type
    conf
  • DOI
    10.1109/ACC.1994.735224
  • Filename
    735224