Adaptive dynamic programming

Author

Murray, John J. ; Cox, Chadwick J. ; Lendaris, George G. ; Saeks, Richard

Author_Institution

Dept. of Electr. Eng., State Univ. of New York, Stony Brook, NY, USA

Volume

32

Issue

2

fYear

2002

fDate

5/1/2002 12:00:00 AM

Firstpage

140

Lastpage

153

Abstract

Unlike the many soft computing applications where it suffices to achieve a "good approximation most of the time," a control system must be stable all of the time. As such, if one desires to learn a control law in real time, a fusion of soft computing techniques to learn the appropriate control law with hard computing techniques to maintain the stability constraint and guarantee convergence is required. The objective of the paper is to describe an adaptive dynamic programming algorithm (ADPA) which fuses soft computing techniques to learn the optimal cost (or return) functional for a stabilizable nonlinear system with unknown dynamics and hard computing techniques to verify the stability and convergence of the algorithm. Specifically, the algorithm is initialized with a (stabilizing) cost functional and the system is run with the corresponding control law (defined by the Hamilton-Jacobi-Bellman equation), with the resultant state trajectories used to update the cost functional in a soft computing mode. Hard computing techniques are then used to show that this process is globally convergent with stepwise stability to the optimal cost functional/control law pair for an (unknown) input affine system with an input quadratic performance measure (modulo the appropriate technical conditions). Three specific implementations of the ADPA are developed for 1) the linear case, 2) the nonlinear case using a locally quadratic approximation to the cost functional, and 3) the nonlinear case using a radial basis function approximation of the cost functional. These implementations are illustrated by applications to flight control.
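The iteration outlined in the abstract (start from a stabilizing cost functional, apply the corresponding control law, update the cost functional, repeat) can be illustrated for the linear case with a minimal sketch. This is not the paper's implementation. It assumes a linear system x' = Ax + Bu with quadratic cost, a quadratic cost functional V(x) = x'Px, and, unlike the paper's setting, a known model (A, B), so that each cost update is obtained by solving a Lyapunov equation rather than from observed state trajectories. The function names, the double-integrator example, and the initial gain K0 are all hypothetical.

```python
import numpy as np

def solve_lyapunov(Ac, M):
    """Solve Ac.T @ P + P @ Ac + M = 0 for P (Kronecker vectorization)."""
    n = Ac.shape[0]
    I = np.eye(n)
    # With row-major flattening, vec(A X B) = kron(A, B.T) vec(X).
    L = np.kron(Ac.T, I) + np.kron(I, Ac.T)
    P = np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Alternate cost evaluation and control-law update from a stabilizing K0.

    ASSUMPTION: A and B are known here; the paper instead treats the
    dynamics as unknown and updates the cost from state trajectories.
    """
    K = K0
    for _ in range(iters):
        Ac = A - B @ K  # closed-loop dynamics under the current control law
        if np.any(np.linalg.eigvals(Ac).real >= 0):
            raise ValueError("current control law is not stabilizing")
        # Cost of the current control law: V(x) = x.T @ P @ x
        P = solve_lyapunov(Ac, Q + K.T @ R @ K)
        # Control law from the HJB equation for quadratic V: u = -R^{-1} B.T P x
        K = np.linalg.solve(R, B.T @ P)
    return P, K

# Hypothetical example: double integrator with unit state and input weights.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.0]])  # stabilizing initial gain (assumed)

P, K = policy_iteration(A, B, Q, R, K0)
print("P =", P)
print("K =", K)
```

The stability check inside the loop mirrors the stepwise-stability property claimed in the abstract: every intermediate control law is verified to be stabilizing before its cost is evaluated.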

Keywords

adaptive control; aerospace control; convergence; dynamic programming; learning (artificial intelligence); nonlinear control systems; optimal control; stability; Hamilton-Jacobi-Bellman equation; adaptive dynamic programming algorithm; control law; flight control; hard computing; input affine system; input quadratic performance measure; nonlinear control; nonlinear system; optimal cost functional; radial basis function approximation; real-time; soft computing applications; stability constraint; state trajectories; unknown dynamics; Computer applications; Control systems; Cost function; Function approximation; Fuses; Heuristic algorithms; Nonlinear systems

fLanguage

English

Journal_Title

Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on

Publisher

IEEE

ISSN

1094-6977

Type

jour

DOI

10.1109/TSMCC.2002.801727

Filename

1039198