Title :
Neural-network based online policy iteration for continuous-time infinite-horizon optimal control of nonlinear systems
Author :
Difan Tang ; Lei Chen ; Zhao Feng Tian
Author_Institution :
Sch. of Mech. Eng., Univ. of Adelaide, Adelaide, SA, Australia
Abstract :
This paper proposes a new neural-network (NN) based policy-iteration algorithm that synthesizes optimal control laws online for continuous-time nonlinear systems. Recent advances in this field have enabled synchronous policy iteration, but they require an additional tuning loop or a logic switch mechanism to maintain system stability. The algorithm derived here addresses this limitation. The optimal control law is found by solving the Hamilton-Jacobi-Bellman (HJB) equation for the associated value function via synchronous policy iteration in a critic-actor configuration. As the major contribution, a new form of NN approximation for the value function is proposed that provides asymptotic stability of the closed-loop system without an additional tuning scheme or logic switch mechanism. As a second contribution, an extended Kalman filter is introduced to estimate the critic NN parameters for fast convergence. The efficacy of the new algorithm is verified by simulations.
Keywords :
Kalman filters; closed loop systems; continuous time systems; control system synthesis; infinite horizon; neurocontrollers; nonlinear control systems; nonlinear filters; optimal control; stability; HJB equation; Hamilton-Jacobi-Bellman equation; NN approximation; NNs; associated value function; closed-loop system asymptotic stability; continuous-time infinite-horizon optimal control; continuous-time nonlinear systems; critic NN parameter estimation; critic-actor configuration; extended Kalman filter; logic switch mechanism; neural networks; online policy iteration; optimal control law synthesis; synchronous policy iteration; system stability; tuning loop; Approximation methods; Decision support systems; Dynamic programming; Markov processes; Radio frequency; Robustness; TV; machine learning; neural network; nonlinear system; policy iteration
Conference_Title :
Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on
Conference_Location :
Chengdu
DOI :
10.1109/ChinaSIP.2015.7230513