DocumentCode :
2823207
Title :
Learning with Eligibility Traces in Adaptive Critic Designs
Author :
Xu, Jing ; Liang, Fu-Ming ; Yu, Wen-Sheng
Author_Institution :
Chinese Acad. of Sci., Beijing
fYear :
2006
fDate :
13-15 Dec. 2006
Firstpage :
309
Lastpage :
313
Abstract :
In this paper, we study training strategies for the critic network in adaptive critic designs. The conventional strategy conducts an internal training cycle for the specified object at each time step, based on the relative information from two consecutive time steps. In our work, by contrast, the mechanism of eligibility traces is adopted for learning the cost function or its derivatives, so that learning depends on the current error combined with traces of past events. For the classical single cart-pole balancing problem, we implement our idea with one typical adaptive critic design, namely action-dependent heuristic dynamic programming. Comparative results show that our approach is more efficient in terms of learning speed and learning success rate.
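Note: the abstract's phrase "the current error combined with traces of past events" can be illustrated with a generic TD(lambda) update. The sketch below is not the authors' implementation. It assumes a linear critic instead of the paper's critic network, and the function name `td_lambda_update`, the `reward` signal, and the values of `gamma`, `lam`, and `alpha` are all hypothetical choices made for illustration.

```python
# Illustrative sketch only: a linear critic trained with TD(lambda).
# The paper trains a neural critic inside ADHDP; every name and
# hyperparameter here is an assumption, not taken from the paper.

def td_lambda_update(weights, traces, features, next_features,
                     reward, gamma=0.95, lam=0.8, alpha=0.1):
    """One critic update using accumulating eligibility traces."""
    value = sum(w * x for w, x in zip(weights, features))
    next_value = sum(w * x for w, x in zip(weights, next_features))
    # Current temporal-difference error between two consecutive steps.
    td_error = reward + gamma * next_value - value
    # Decay the old traces, then add the gradient of the current estimate
    # (for a linear critic the gradient is just the feature vector).
    new_traces = [gamma * lam * e + x for e, x in zip(traces, features)]
    # Each weight moves in proportion to the current error and its trace,
    # so features seen in earlier steps still receive part of the update.
    new_weights = [w + alpha * td_error * e
                   for w, e in zip(weights, new_traces)]
    return new_weights, new_traces, td_error
```

Setting `lam=0` reduces this to the conventional one-step update that uses only the two consecutive time steps. Larger values of `lam` let an error observed now also adjust weights tied to earlier states.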
Keywords :
dynamic programming; heuristic programming; learning (artificial intelligence); neural nets; adaptive critic design; classical single cart-pole balancing problem; eligibility trace; heuristic dynamic programming; internal training cycle; reinforcement learning; Adaptive systems; Cost function; Dynamic programming; Environmental economics; Function approximation; Learning; Monte Carlo methods; Nonlinear equations; Nonlinear systems; Operations research;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Vehicular Electronics and Safety, 2006. ICVES 2006. IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
1-4244-0759-1
Electronic_ISBN :
1-4244-0759-1
Type :
conf
DOI :
10.1109/ICVES.2006.371605
Filename :
4234041