Regional Information Center for Science and Technology - Neural-network based online policy iteration for continuous-time infinite-horizon optimal control of nonlinear systems

DocumentCode :

735082

Title :

Neural-network based online policy iteration for continuous-time infinite-horizon optimal control of nonlinear systems

Author :

Difan Tang ; Lei Chen ; Zhao Feng Tian

Author_Institution :

Sch. of Mech. Eng., Univ. of Adelaide, Adelaide, SA, Australia

fYear :

2015

fDate :

12-15 July 2015

Firstpage :

792

Lastpage :

796

Abstract :

A new policy-iteration algorithm based on neural networks (NNs) is proposed in this paper to synthesize optimal control laws online for continuous-time nonlinear systems. Recent advances in this field have enabled synchronous policy iteration but require an additional tuning loop or a logic switch mechanism to maintain system stability. A new algorithm is thus derived in this paper to address this limitation. The optimal control law is found by solving the Hamilton-Jacobi-Bellman (HJB) equation for the associated value function via synchronous policy iteration in a critic-actor configuration. As a major contribution, a new form of NN approximation for the value function is proposed, offering the closed-loop system asymptotic stability without an additional tuning scheme or logic switch mechanism. As a second contribution, an extended Kalman filter is introduced to estimate the critic NN parameters for fast convergence. The efficacy of the new algorithm is verified by simulations.

Keywords :

Kalman filters; closed loop systems; continuous time systems; control system synthesis; infinite horizon; neurocontrollers; nonlinear control systems; nonlinear filters; optimal control; stability; HJB equation; Hamilton-Jacobi-Bellman equation; NN approximation; NNs; associated value function; closed-loop system asymptotic stability; continuous-time infinite-horizon optimal control; continuous-time nonlinear systems; critic NN parameter estimation; critic-actor configuration; extended Kalman filter; logic switch mechanism; neural networks; online policy iteration; optimal control law synthesis; synchronous policy iteration; system stability; tuning loop; Approximation methods; Decision support systems; Dynamic programming; Markov processes; Radio frequency; Robustness; TV; machine learning; neural network; nonlinear system; optimal control; policy iteration;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Signal and Information Processing (ChinaSIP), 2015 IEEE China Summit and International Conference on

Conference_Location :

Chengdu

Type :

conf

DOI :

10.1109/ChinaSIP.2015.7230513

Filename :

7230513

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=735082