DocumentCode :
3600639
Title :
Model-Free Optimal Control for Affine Nonlinear Systems With Convergence Analysis
Author :
Dongbin Zhao ; Zhongpu Xia ; Ding Wang
Author_Institution :
State Key Lab. of Manage. & Control for Complex Syst., Inst. of Autom., Beijing, China
Volume :
12
Issue :
4
fYear :
2015
Firstpage :
1461
Lastpage :
1468
Abstract :
In this paper, a self-learning control scheme is proposed for the infinite-horizon optimal control of affine nonlinear systems based on the action-dependent heuristic dynamic programming algorithm. The policy iteration technique is introduced to derive the optimal control policy, with feasibility and convergence analysis. The analysis shows that the “greedy” control action for each state exists and is unique, that the control policy learned after each policy iteration is admissible, and that the optimal control policy can be obtained. Two three-layer perceptron neural networks are employed to implement the scheme. The critic network is trained by a novel rule to conform to the Bellman equation, and the action network is trained to yield a better control policy. The two training processes alternate until the optimal control policy is achieved. Two simulation examples are provided to validate the effectiveness of the approach. Note to Practitioners: Control practitioners seek to design optimal controllers without mathematical models, whereas existing approaches usually derive optimal controllers from mathematical models or identified models. This paper proposes a new approach that derives optimal controllers by a numerical iteration method without any knowledge of the mathematical model. It evaluates every state-action pair in the whole state-action space using data collected from the underlying system, and then selects the action with the best evaluation for each state. What is required is an initial admissible control policy. Theorems show that optimal controllers can be obtained, and simulation studies verify the effectiveness of the approach. Further research will extend this approach to online self-learning optimal control, so that it can adapt to variations in the underlying system.
Keywords :
affine transforms; convergence; infinite horizon; iterative methods; multilayer perceptrons; nonlinear control systems; optimal control; self-adjusting systems; Bellman equation; action dependent heuristic dynamic programming algorithm; action network; admissible control policy; affine nonlinear system; control practitioner; convergence analysis; greedy control action; infinite horizon optimal control; learned control policy; mathematical model; model-free optimal control; numerical iteration method; online self-learning optimal control approach; optimal control policy; optimal controller; policy iteration technique; self-learning control scheme; state-action pair; state-action space; three-layer perceptron neural network; Dynamic programming; Heuristic algorithms; Mathematical model; Neural networks; Nonlinear systems; Optimal control; Action dependent heuristic dynamic programming; adaptive dynamic programming; model-free optimal control; neural networks; policy iteration;
fLanguage :
English
Journal_Title :
Automation Science and Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1545-5955
Type :
jour
DOI :
10.1109/TASE.2014.2348991
Filename :
6891335