Adaptive Learning in Tracking Control Based on the Dual Critic Network Design

Author

Zhen Ni ; Haibo He ; Jinyu Wen

Author_Institution

Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA

Volume

24

Issue

6

fYear

2013

fDate

June 2013

Firstpage

913

Lastpage

928

Abstract

In this paper, we present a new adaptive dynamic programming approach that integrates a reference network providing an internal goal representation to aid the system's learning and optimization. Specifically, we build the reference network on top of the critic network to form a dual critic network design that contains a detailed internal goal representation to help approximate the value function. This internal goal signal, which serves as the reinforcement signal for the critic network in our design, is generated adaptively by the reference network and can be adjusted automatically. The design thus offers an alternative to crafting the reinforcement signal manually from prior knowledge. We adopt the online action-dependent heuristic dynamic programming (ADHDP) design and describe the dual critic network structure in detail. A detailed Lyapunov stability analysis supports the proposed structure from a theoretical point of view. Furthermore, we develop a virtual reality platform to demonstrate real-time simulation of our approach under different disturbance conditions. The overall adaptive learning performance is tested on two tracking control benchmarks with a tracking filter. For comparison, we also present the tracking performance of the typical ADHDP; the simulation results show improved performance with our approach.

Keywords

Lyapunov methods; approximation theory; control engineering computing; dynamic programming; learning (artificial intelligence); stability; tracking filters; virtual reality; ACD; ADHDP design; ADP; Lyapunov stability analysis; adaptive critic design; adaptive dynamic programming approach; adaptive learning; dual critic network design; dual critic network structure; internal goal representation; internal goal signal; online action-dependent heuristic dynamic programming design; real-time simulation; reference network integration; reinforcement signal; system optimization; tracking control; tracking filter; value function approximation; virtual reality platform; Adaptive systems; Optimization; Vectors; Adaptive critic design (ACD); adaptive dynamic programming (ADP); internal goal; online learning; reinforcement learning

fLanguage

English

Journal_Title

IEEE Transactions on Neural Networks and Learning Systems

Publisher

IEEE

ISSN

2162-237X

Type

jour

DOI

10.1109/TNNLS.2013.2247627

Filename

6476025