Title :
Continuous adaptive critic designs
Author :
Hanselmann, Thomas ; Noakes, Lyle ; Zaknich, Anthony
Author_Institution :
Dept. of Electr. & Electron. Eng., Melbourne Univ., Parkville, Vic., Australia
fDate :
31 July-4 Aug. 2005
Abstract :
A continuous formulation of an adaptive critic design (ACD) is investigated. Connections are made to the discrete case, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are prevalent. A second-order actor adaptation, based on Newton's method, is established for fast actor convergence. In addition, a fast critic update for concurrent actor-critic training is outlined that keeps Bellman optimality correct to a first-order approximation after actor changes.
Keywords :
Newton method; backpropagation; Bellman optimality; actor adaptation; actor convergence; adaptive critic design; backpropagation through time; concurrent actor-critic training; real-time recurrent learning; Australia; Continuous time systems; Cost function; Design engineering; Dynamic programming; Equations; Mathematics; Statistics; Terminology
Conference_Title :
Neural Networks, 2005. IJCNN '05. Proceedings. 2005 IEEE International Joint Conference on
Print_ISBN :
0-7803-9048-2
DOI :
10.1109/IJCNN.2005.1556403