DocumentCode
2644049
Title
Reinforcement learning in non-Markovian environments using automatic discovery of subgoals
Author
Dung, Le Tien; Komeda, Takashi; Takagi, Motoki
Author_Institution
Shibaura Institute of Technology, Tokyo
fYear
2007
fDate
17-20 Sept. 2007
Firstpage
2601
Lastpage
2605
Abstract
Learning time is always a critical issue in reinforcement learning, especially when recurrent neural networks (RNNs) are used to predict Q-values. By creating useful subgoals, we can speed up learning. In this paper, we propose a method to accelerate learning in non-Markovian environments through automatic discovery of subgoals. Once subgoals are created, sub-policies use RNNs to attain them. The learned RNNs are then integrated into the main RNN as experts, and the agent continues to learn with its new policy. Experimental results on the E-maze problem and the virtual office problem show the potential of this approach.
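For illustration, the minimal Python sketch below walks through the three-stage pipeline the abstract describes: learn the task, discover a subgoal from successful trajectories, then train a dedicated sub-policy to attain it. It is an assumption-laden stand-in, not the authors' algorithm: the paper uses RNNs on non-Markovian tasks, whereas this sketch uses tabular Q-learning on a fully observable toy corridor, and the visit-frequency bottleneck heuristic for subgoal discovery is a common simplification, not necessarily the paper's criterion. All names (N, step, q_learning) are invented for the example.

import random
from collections import Counter, defaultdict

# Toy corridor MDP: states 0..N-1, start at 0, task goal at N-1.
# The paper targets non-Markovian tasks with RNN-based Q prediction;
# this fully observable toy stands in only to show the pipeline.
N = 12
ACTIONS = (0, 1)  # 0 = left, 1 = right

def step(s, a):
    """Deterministic corridor dynamics."""
    return max(0, s - 1) if a == 0 else min(N - 1, s + 1)

def q_learning(goal, episodes=300, eps=0.3, alpha=0.5, gamma=0.95):
    """Tabular epsilon-greedy Q-learning toward an arbitrary goal state.
    Returns the Q-table and the state trajectories of successful episodes."""
    Q = defaultdict(lambda: [0.0, 0.0])
    successes = []
    for _ in range(episodes):
        s, traj = 0, [0]
        for _ in range(4 * N):
            if random.random() < eps:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: Q[s][x])
            s2 = step(s, a)
            done = (s2 == goal)            # reaching this run's goal ends the episode
            r = 1.0 if done else -0.01     # small step cost, terminal reward
            Q[s][a] += alpha * (r + gamma * (0.0 if done else max(Q[s2])) - Q[s][a])
            s = s2
            traj.append(s2)
            if done:
                successes.append(traj)
                break
    return Q, successes

# 1) Learn the full task and keep the successful trajectories.
Q_main, succ = q_learning(goal=N - 1)

# 2) Discover a subgoal: the non-terminal state seen most often on
#    successful trajectories (a simple bottleneck heuristic; the paper's
#    actual discovery criterion may differ).
counts = Counter(s for t in succ for s in t if 0 < s < N - 1)
subgoal = counts.most_common(1)[0][0]

# 3) Train a sub-policy whose only objective is to attain the subgoal.
#    In the paper this role is played by a dedicated RNN that is later
#    merged into the main network as an expert.
Q_sub, _ = q_learning(goal=subgoal)

print("discovered subgoal:", subgoal)
print("greedy sub-policy action at the start state:",
      max(ACTIONS, key=lambda a: Q_sub[0][a]))

In the paper, the analogue of Q_sub would be a learned RNN integrated into the main RNN as an expert before the agent resumes learning with its new policy.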
Keywords
learning (artificial intelligence); prediction theory; recurrent neural nets; E-maze problem; Q-value prediction; non-Markovian environments; recurrent neural networks; reinforcement learning; automatic subgoal discovery; virtual office problem; Acceleration; Electronic mail; Recurrent neural networks; Relays; Robots; State-space methods; Supervised learning; Systems engineering and theory; Teleworking
fLanguage
English
Publisher
IEEE
Conference_Title
SICE Annual Conference 2007
Conference_Location
Takamatsu
Print_ISBN
978-4-907764-27-2
Electronic_ISBN
978-4-907764-27-2
Type
conf
DOI
10.1109/SICE.2007.4421430
Filename
4421430