• DocumentCode
    22235
  • Title
    Continuous-Time Q-Learning for Infinite-Horizon Discounted Cost Linear Quadratic Regulator Problems
  • Author
    Palanisamy, Muthukumar ; Modares, Hamidreza ; Lewis, Frank L. ; Aurangzeb, Muhammad
  • Author_Institution
    Dept. of Math., Gandhigram Rural Inst., Gandhigram, India
  • Volume
    45
  • Issue
    2
  • fYear
    2015
  • fDate
    Feb. 2015
  • Firstpage
    165
  • Lastpage
    176
  • Abstract
    This paper presents a method of Q-learning to solve the discounted linear quadratic regulator (LQR) problem for continuous-time (CT) continuous-state systems. Most available methods in the existing literature for CT systems to solve the LQR problem generally need partial or complete knowledge of the system dynamics. Q-learning is effective for unknown dynamical systems, but has generally been well understood only for discrete-time systems. The contribution of this paper is to present a Q-learning methodology for CT systems which solves the LQR problem without having any knowledge of the system dynamics. A natural and rigorously justified parameterization of the Q-function is given in terms of the state, the control input, and its derivatives. This parameterization allows the implementation of an online Q-learning algorithm for CT systems. The simulation results supporting the theoretical development are also presented.
  • Keywords
    continuous time systems; discrete time systems; infinite horizon; learning (artificial intelligence); linear quadratic control; nonlinear dynamical systems; CT continuous-state systems; LQR problem; discrete-time systems; infinite-horizon discounted cost linear quadratic regulator problem; online Q-learning algorithm; unknown dynamical systems; Approximation algorithms; Convergence; Discrete-time systems; Equations; Heuristic algorithms; Mathematical model; Optimal control; Approximate dynamic programming (ADP); Q-learning; continuous-time dynamical systems; infinite-horizon discounted cost function; integral reinforcement learning (IRL); optimal control; value iteration (VI)
  • fLanguage
    English
  • Journal_Title
    Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-2267
  • Type
    jour
  • DOI
    10.1109/TCYB.2014.2322116
  • Filename
    6822502