Model-free Q-learning over finite horizon for uncertain linear continuous-time systems

Author

Xu, Hao; Jagannathan, Sarangapani

Author_Institution

Coll. of Sci. & Eng., Texas A&M Univ. - Corpus Christi, Corpus Christi, TX, USA

fYear

2014

fDate

9-12 Dec. 2014

Firstpage

1

Lastpage

6

Abstract

This paper introduces a novel finite-horizon optimal control scheme for uncertain linear continuous-time systems based on adaptive dynamic programming (ADP). First, a new time-varying Q-function parameterization and its estimator are introduced. The Q-function estimator is then tuned online using both the integral form of the Bellman equation and the terminal cost. Finally, a near-optimal control gain is obtained from the Q-function estimator. Lyapunov stability analysis shows that all closed-loop signals are bounded, with bounds that are functions of the initial conditions and the final time, while the estimated control signal converges close to its optimal value. Simulation results illustrate the effectiveness of the proposed scheme.
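For orientation, the sketch below shows the classical model-based baseline that a finite-horizon Q-learning scheme of this kind aims to approximate. It is not the authors' algorithm. It assumes a hypothetical scalar system dx/dt = a*x + b*u with cost integral of (q*x^2 + r*u^2) plus terminal cost s_T*x(T)^2, integrates the scalar Riccati differential equation backward from the final time, and then reads the time-varying gain from the two Q-function kernel entries g_uu = r and g_ux = b*p(t). All function names and numeric values are illustrative assumptions, not taken from the paper.

```python
import math


def riccati_backward(a, b, q, r, s_T, T, steps=10000):
    """Integrate the scalar Riccati ODE backward from t = T to t = 0 (RK4).

    Forward-time form:  -dp/dt = 2*a*p - (b**2 / r) * p**2 + q,  p(T) = s_T.
    With tau = T - t this becomes dp/dtau = f(p), started from p = s_T.
    Returns a list with traj[i] approximating p(i * T / steps).
    """
    def f(p):
        return 2.0 * a * p - (b * b / r) * p * p + q

    h = T / steps
    p = s_T
    traj = [p]  # collected in reverse time: p(T), p(T - h), ...
    for _ in range(steps):
        k1 = f(p)
        k2 = f(p + 0.5 * h * k1)
        k3 = f(p + 0.5 * h * k2)
        k4 = f(p + h * k3)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        traj.append(p)
    traj.reverse()  # now indexed forward in time
    return traj


def gain_from_q_kernel(g_uu, g_ux):
    """Control gain k in u = -k * x, read from Q-function kernel entries.

    Once the kernel entries are known (or estimated), this step needs no
    explicit knowledge of the system parameter a.
    """
    return g_ux / g_uu


# Illustrative (assumed) parameters: stable plant, zero terminal weight.
a, b, q, r, s_T, T = -1.0, 1.0, 1.0, 1.0, 0.0, 10.0
p_traj = riccati_backward(a, b, q, r, s_T, T)
k0 = gain_from_q_kernel(g_uu=r, g_ux=b * p_traj[0])  # gain at t = 0
```

In a model-free setting the kernel entries would be estimated online from measured data rather than computed from a and b as done here. The backward sweep is shown only to make the target quantity concrete.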

Keywords

Lyapunov methods; closed loop systems; continuous time systems; dynamic programming; integral equations; learning (artificial intelligence); linear systems; optimal control; stability; uncertain systems; Lyapunov stability analysis; Q-function estimator tuning; adaptive dynamic programming; closed-loop signals; finite horizon; initial conditions; integral form Bellman equation; model-free Q-learning; near optimal control gain; time-varying Q-function parameterization; uncertain linear continuous-time systems; Equations; Integral equations; Mathematical model; Optimal control; Parameter estimation; Tuning; Vectors; Adaptive Dynamic Programming (ADP); Forward-in-time; Optimal Control; Q-learning; Riccati Equation

fLanguage

English

Publisher

IEEE

Conference_Titel

Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2014 IEEE Symposium on

Conference_Location

Orlando, FL

Type

conf

DOI

10.1109/ADPRL.2014.7010629

Filename

7010629