DocumentCode :
88632
Title :
Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming
Author :
Qinglai Wei ; Fei-Yue Wang ; Derong Liu ; Xiong Yang
Author_Institution :
State Key Lab. of Manage. & Control for Complex Syst., Inst. of Autom., Beijing, China
Volume :
44
Issue :
12
fYear :
2014
fDate :
Dec. 2014
Firstpage :
2820
Lastpage :
2833
Abstract :
This paper develops a new iterative adaptive dynamic programming (ADP) algorithm for solving optimal control problems for infinite-horizon discrete-time nonlinear systems with finite approximation errors. First, a new generalized value iteration algorithm of ADP is developed so that the iterative performance index function converges to the solution of the Hamilton-Jacobi-Bellman equation. The generalized value iteration algorithm can be initialized with an arbitrary positive semidefinite function, which overcomes the initialization restriction of traditional value iteration algorithms. For the case in which the iterative control law and the iterative performance index function cannot be obtained accurately in each iteration, a new “design method of the convergence criteria” for the finite-approximation-error-based generalized value iteration algorithm is established for the first time. A suitable approximation error can be designed adaptively so that the iterative performance index function converges to a finite neighborhood of the optimal performance index function. Neural networks are used to implement the iterative ADP algorithm. Finally, two simulation examples illustrate the performance of the developed method.
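The idea of starting value iteration from an arbitrary positive semidefinite function can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a hypothetical scalar linear system x_{k+1} = a*x_k + b*u_k with utility q*x^2 + r*u^2, for which the iterative performance index function has the exact form V_i(x) = p_i*x^2, so no neural networks and no approximation error are involved. The function name `value_iteration` and all parameter values are illustrative assumptions.

```python
import math


def value_iteration(p0, a=1.0, b=1.0, q=1.0, r=1.0, iters=50):
    """Value iteration for V_i(x) = p_i * x**2 (illustrative scalar case).

    System (assumed): x_{k+1} = a*x_k + b*u_k, utility q*x**2 + r*u**2.
    p0 >= 0 plays the role of the positive semidefinite initial function.
    """
    if p0 < 0:
        raise ValueError("initial function must be positive semidefinite (p0 >= 0)")
    p = p0
    history = [p]
    for _ in range(iters):
        # Minimizing r*u**2 + p*(a*x + b*u)**2 over u gives
        # u = -(a*b*p / (r + b*b*p)) * x; substituting back yields the update.
        p = q + a * a * p * r / (r + b * b * p)
        history.append(p)
    return history


# For a = b = q = r = 1 the fixed point solves p**2 - p - 1 = 0.
p_star = (1 + math.sqrt(5)) / 2

low = value_iteration(0.0)    # zero initial function
high = value_iteration(10.0)  # arbitrary positive semidefinite initial function
```

In this sketch both runs approach the same fixed point `p_star`, one from below and one from above, which is the behavior the abstract attributes to the generalized value iteration algorithm in the error-free case.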
Keywords :
adaptive control; approximation theory; discrete time systems; dynamic programming; infinite horizon; iterative methods; neurocontrollers; nonlinear control systems; optimal control; Hamilton-Jacobi-Bellman equation; finite approximation error; finite neighborhood; finite-approximation-error-based discrete-time iterative adaptive dynamic programming; finite-approximation-error-based generalized value iteration algorithm; infinite horizon discrete-time nonlinear systems; iterative ADP algorithm; iterative adaptive dynamic programming algorithm; iterative control law; iterative performance index function; neural networks; optimal control problem; optimal performance index function; positive semidefinite function; traditional value iteration algorithm; Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; approximation error; neural networks; neuro-dynamic programming; nonlinear systems; optimal control; reinforcement learning; value iteration;
fLanguage :
English
Journal_Title :
Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
2168-2267
Type :
jour
DOI :
10.1109/TCYB.2014.2354377
Filename :
6912005