DocumentCode :
349966
Title :
RTP-Q: a reinforcement learning system with an active exploration planning structure for enhancing the convergence rate
Author :
Zhao, Gang ; Tatsumi, Shoji ; Sun, Ruoying
Author_Institution :
Fac. of Eng., Osaka City Univ., Japan
Volume :
5
fYear :
1999
fDate :
1999
Firstpage :
475
Abstract :
In this paper, we propose an active exploration planning method for the prioritized sweeping reinforcement learning system that lets an agent explore an environment efficiently. Taking into account the estimated-value feature of the primitive learning system in our structure, the method fully uses the learned model, plans active exploration actions, and simplifies the setting of the parameters. The proposed system thus uses the learned model efficiently not only for computing estimates but also for realizing active exploration of the environment. Comparison experiments with different methods on navigation tasks demonstrate the efficiency of the proposed method.
Keywords :
learning (artificial intelligence); learning systems; navigation; path planning; Dyna Q architecture; active exploration planning; learning system; navigation; path planning; prioritized sweeping learning; reinforcement learning; Business; Convergence; Educational institutions; Engineering management; Environmental management; Learning; Navigation; Power system planning; Process planning; Sun;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man, and Cybernetics, 1999. IEEE SMC '99 Conference Proceedings. 1999 IEEE International Conference on
Conference_Location :
Tokyo
ISSN :
1062-922X
Print_ISBN :
0-7803-5731-0
Type :
conf
DOI :
10.1109/ICSMC.1999.815597
Filename :
815597