DocumentCode :
2717132
Title :
Discrete-time nonlinear HJB solution using Approximate dynamic programming: Convergence Proof
Author :
Al-Tamimi, Asma ; Lewis, Frank
Author_Institution :
Autom. & Robotics Res. Inst., Univ. of Texas at Arlington, Fort Worth, TX
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
38
Lastpage :
43
Abstract :
In this paper, a greedy iteration scheme based on approximate dynamic programming (ADP), namely heuristic dynamic programming (HDP), is used to solve for the value function of the Hamilton-Jacobi-Bellman (HJB) equation that appears in discrete-time (DT) nonlinear optimal control. Two neural networks are used: one to approximate the value function and one to approximate the optimal control action. The importance of ADP is that it allows one to solve the HJB equation for general nonlinear discrete-time systems by using a neural network to approximate the value function. The contribution of this paper is a rigorous proof of convergence of the HDP iteration scheme for general discrete-time nonlinear systems with continuous state and action spaces. Two examples are provided. The first is a linear system, for which ADP is found to converge to the correct solution of the algebraic Riccati equation (ARE). The second considers a nonlinear control system.
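The linear case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a quadratic value function V_i(x) = x^T P_i x in place of the two neural networks, starts from V_0 = 0, and uses hypothetical matrices A, B, Q, R (a discretized double integrator) chosen only for illustration. Under those assumptions the greedy HDP update reduces to the Riccati difference recursion, and the sketch checks numerically that the iterate approaches a fixed point of the discrete-time ARE.

```python
import numpy as np

def hdp_lqr(A, B, Q, R, iters=5000, tol=1e-12):
    """Greedy value iteration for x_{k+1} = A x_k + B u_k with stage cost
    x^T Q x + u^T R u, assuming V_i(x) = x^T P_i x and V_0 = 0."""
    n = A.shape[0]
    P = np.zeros((n, n))  # V_0(x) = 0
    for _ in range(iters):
        # Policy step: u = -K x minimizes x^T Q x + u^T R u + V_i(A x + B u)
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Value step: V_{i+1}(x) is the minimized right-hand side
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ K
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    # Gain that is greedy with respect to the final value function
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

# Hypothetical example system (not taken from the paper)
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = hdp_lqr(A, B, Q, R)

# Residual of the discrete-time ARE evaluated at the final iterate
residual = np.max(np.abs(Q + A.T @ P @ A - A.T @ P @ B @ K - P))
print("ARE residual:", residual)
print("closed-loop spectral radius:", np.max(np.abs(np.linalg.eigvals(A - B @ K))))
```

A small residual together with a closed-loop spectral radius below one is consistent with convergence to the stabilizing ARE solution for this particular example; it does not substitute for the general proof the paper provides.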
Keywords :
Riccati equations; convergence; discrete time systems; dynamic programming; heuristic programming; iterative methods; neural nets; nonlinear control systems; optimal control; Hamilton Jacobi Bellman equation; algebraic Riccati equation; approximate dynamic programming; convergence proof; discrete-time nonlinear optimal control; greedy iteration; heuristic dynamic programming; neural networks; optimal control action approximation; value function approximation; Convergence; Dynamic programming; Function approximation; Learning; Linear systems; Neural networks; Nonlinear equations; Optimal control; Riccati equations; Robotics and automation; Adaptive critics; Approximate dynamic programming; HJB; Policy iterations;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
Type :
conf
DOI :
10.1109/ADPRL.2007.368167
Filename :
4220812