Title :
Actor–Critic-Based Optimal Tracking for Partially Unknown Nonlinear Discrete-Time Systems
Author :
Kiumarsi, B. ; Lewis, F.L.
Author_Institution :
UTA Res. Inst., Univ. of Texas at Arlington, Fort Worth, TX, USA
Abstract :
This paper presents a partially model-free adaptive optimal control solution to the deterministic nonlinear discrete-time (DT) tracking control problem in the presence of input constraints. The tracking error dynamics and reference trajectory dynamics are first combined to form an augmented system. Then, a new discounted performance function based on the augmented system is presented for the optimal nonlinear tracking problem. In contrast to the standard solution, which finds the feedforward and feedback terms of the control input separately, the minimization of the proposed discounted performance function gives both feedback and feedforward parts of the control input simultaneously. This enables us to encode the input constraints into the optimization problem using a nonquadratic performance function. The DT tracking Bellman equation and tracking Hamilton-Jacobi-Bellman (HJB) equation are derived. An actor-critic-based reinforcement learning algorithm is used to learn the solution to the tracking HJB equation online without requiring knowledge of the system drift dynamics. That is, two neural networks (NNs), namely, actor NN and critic NN, are tuned online and simultaneously to generate the optimal bounded control policy. A simulation example is given to show the effectiveness of the proposed method.
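The augmented system and the discounted nonquadratic performance function described in the abstract can be illustrated with a minimal scalar sketch. The abstract does not give the formulas, so everything here is an assumption: the penalty uses the inverse-hyperbolic-tangent form common in the constrained-input optimal control literature, and the names `input_penalty`, `augmented_step`, `discounted_cost`, `f`, `g`, `psi`, `bound`, `q`, `r`, and `gamma` are hypothetical.

```python
import math


def input_penalty(u, bound=1.0, r=1.0):
    """Nonquadratic penalty W(u) = 2*r*bound * integral_0^u atanh(s/bound) ds.

    Assumed form (not stated in the abstract); scalar input only.
    For small |u| it behaves like the quadratic penalty r*u**2.
    """
    if abs(u) >= bound:
        # Sketch choice: inputs at or beyond the bound are treated as inadmissible.
        return math.inf
    z = u / bound
    # Closed form of the integral: u*atanh(z) + (bound/2)*ln(1 - z^2)
    return 2.0 * r * bound * (u * math.atanh(z) + 0.5 * bound * math.log(1.0 - z * z))


def augmented_step(e, ref, u, f, g, psi):
    """One step of the augmented state (tracking error, reference).

    f is the drift, g the input gain, psi the reference generator;
    all three are placeholder callables supplied by the caller.
    """
    x = e + ref                    # plant state recovered from error and reference
    ref_next = psi(ref)            # reference trajectory dynamics
    x_next = f(x) + g(x) * u       # plant dynamics
    return x_next - ref_next, ref_next


def discounted_cost(errors, inputs, gamma=0.9, q=1.0, bound=1.0, r=1.0):
    """Finite-horizon approximation of sum_i gamma^i * (q*e_i^2 + W(u_i))."""
    total = 0.0
    for i, (e, u) in enumerate(zip(errors, inputs)):
        total += gamma ** i * (q * e * e + input_penalty(u, bound, r))
    return total
```

Because a single cost is defined on the augmented state, one bounded control signal is penalized as a whole, rather than splitting it into separately designed feedforward and feedback parts.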
Keywords :
adaptive control; discrete time systems; feedback; feedforward; learning (artificial intelligence); minimisation; neurocontrollers; nonlinear control systems; optimal control; DT tracking Bellman equation; DT tracking control problem; Hamilton-Jacobi-Bellman; actor NN; actor-critic-based optimal tracking; actor-critic-based reinforcement learning algorithm; augmented system; critic NN; deterministic nonlinear discrete-time tracking control problem; feedback terms; feedforward terms; input constraints; neural networks; nonquadratic performance function; optimal bounded control policy; optimal nonlinear tracking problem; optimization problem; partially model-free adaptive optimal control solution; partially unknown nonlinear discrete-time systems; performance function minimization; reference trajectory dynamics; tracking HJB equation; tracking error dynamics; Equations; Feedforward neural networks; Heuristic algorithms; Mathematical model; Nonlinear dynamical systems; Standards; Trajectory; Actor–critic algorithm; discrete-time (DT) nonlinear optimal tracking; neural network (NN); reinforcement learning (RL)
Journal_Title :
Neural Networks and Learning Systems, IEEE Transactions on
DOI :
10.1109/TNNLS.2014.2358227