Title :
Learning with Eligibility Traces in Adaptive Critic Designs
Author :
Xu, Jing ; Liang, Fu-Ming ; Yu, Wen-Sheng
Author_Institution :
Chinese Acad. of Sci., Beijing
Abstract :
In this paper, we study training strategies for the critic network in adaptive critic designs. The conventional strategy conducts an internal training cycle for the specified objective at each time step, using information from two consecutive time steps. In our work, by contrast, the mechanism of eligibility traces is adopted for learning the cost function or its derivatives, so that each update depends on the current error combined with traces of past events. We implement this idea with one typical adaptive critic design, action-dependent heuristic dynamic programming, on the classical single cart-pole balancing problem. Comparative results indicate that our approach is more efficient in terms of learning speed and learning success rate.
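The trace-based update described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a linear critic J(x) = w · x in place of the critic network, a standard accumulating-trace TD(λ) rule, and hypothetical names and parameter values (`td_lambda_update`, `gamma`, `lam`, `alpha`). Setting `lam=0` recovers a one-step update that uses only two consecutive time steps.

```python
def td_lambda_update(w, e, x, x_next, cost, gamma=0.95, lam=0.8, alpha=0.1):
    """One critic update with an accumulating eligibility trace.

    Assumptions (not from the paper): the critic is linear, J(x) = w . x,
    and gamma, lam, alpha are illustrative values.
    w: critic weights, e: eligibility trace (same length as w).
    """
    j = sum(wi * xi for wi, xi in zip(w, x))            # J at the current step
    j_next = sum(wi * xi for wi, xi in zip(w, x_next))  # J at the next step
    delta = cost + gamma * j_next - j                   # current temporal-difference error
    # Decay the trace of past events, then add the current gradient (x for a linear critic).
    e = [gamma * lam * ei + xi for ei, xi in zip(e, x)]
    # Every weight moves in proportion to the current error times its trace.
    w = [wi + alpha * delta * ei for wi, ei in zip(w, e)]
    return w, e, delta


# Two consecutive steps on hypothetical two-dimensional features.
w, e = [0.0, 0.0], [0.0, 0.0]
w, e, delta = td_lambda_update(w, e, [1.0, 0.0], [0.0, 1.0], cost=1.0)
w, e, delta = td_lambda_update(w, e, [0.0, 1.0], [0.0, 0.0], cost=0.0)
```

After the second step the trace still holds a decayed contribution from the first state, which is how an error observed now can adjust weights associated with earlier events.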
Keywords :
dynamic programming; heuristic programming; learning (artificial intelligence); neural nets; adaptive critic design; classical single cart-pole balancing problem; eligibility trace; heuristic dynamic programming; internal training cycle; reinforcement learning; Adaptive systems; Cost function; Dynamic programming; Environmental economics; Function approximation; Learning; Monte Carlo methods; Nonlinear equations; Nonlinear systems; Operations research;
Conference_Title :
Vehicular Electronics and Safety, 2006. ICVES 2006. IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
1-4244-0759-1
Electronic_ISBN :
1-4244-0759-1
DOI :
10.1109/ICVES.2006.371605