DocumentCode :
1799340
Title :
Using supervised training signals of observable state dynamics to speed-up and improve reinforcement learning
Author :
Elliott, Daniel L. ; Anderson, C.
Author_Institution :
Dept. of Comput. Sci., Colorado State Univ., Fort Collins, CO, USA
fYear :
2014
fDate :
9-12 Dec. 2014
Firstpage :
1
Lastpage :
8
Abstract :
A common complaint about reinforcement learning (RL) is that it is too slow to learn a value function that gives good performance. This issue is exacerbated in continuous state spaces. This paper presents a straightforward approach to speeding up and even improving RL solutions by reusing features learned during a pre-training phase prior to Q-learning. During pre-training, the agent is taught to predict the state change given a state/action pair. The effect of pre-training is examined using the model-free Q-learning approach, but the technique could readily be applied to a number of RL approaches, including model-based RL. The analysis of the results provides ample evidence that the features learned during pre-training are the reason behind the improved RL performance.
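A minimal sketch of the idea described in the abstract, not the authors' implementation: a network is first trained on the supervised signal of predicting state change from a state/action pair, and its hidden layer is then reused to initialize a Q-network before Q-learning. All sizes, hyperparameters, and the `pretrain` helper are illustrative assumptions.

```python
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, HIDDEN = 4, 1, 64  # assumed problem sizes

# Dynamics model: (state, action) -> predicted state change (supervised pre-training target).
dynamics = nn.Sequential(
    nn.Linear(STATE_DIM + ACTION_DIM, HIDDEN), nn.Tanh(),
    nn.Linear(HIDDEN, STATE_DIM),
)

def pretrain(transitions, epochs=50, lr=1e-3):
    """transitions: iterable of (state, action, next_state) tensors."""
    opt = torch.optim.Adam(dynamics.parameters(), lr=lr)
    for _ in range(epochs):
        for s, a, s_next in transitions:
            pred = dynamics(torch.cat([s, a]))
            loss = nn.functional.mse_loss(pred, s_next - s)  # predict the state change
            opt.zero_grad()
            loss.backward()
            opt.step()

# Q-network with the same input->hidden structure, so the pre-trained
# features can be copied in before Q-learning begins.
q_net = nn.Sequential(
    nn.Linear(STATE_DIM + ACTION_DIM, HIDDEN), nn.Tanh(),
    nn.Linear(HIDDEN, 1),  # Q(s, a) for the given state/action pair
)
q_net[0].load_state_dict(dynamics[0].state_dict())  # reuse pre-trained features
```

Q-learning then proceeds as usual on `q_net`; only the initialization differs from training from scratch.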
Keywords :
learning (artificial intelligence); neural nets; state-space methods; RL performance improvement; RL solution improvement; continuous state spaces; feature reuse; model-based RL approach; model-free Q-learning approach; observable state dynamics; pretraining phase; reinforcement learning; state change prediction; state-action pair; supervised training signals; value function learning; Artificial neural networks; Computational modeling; Data models; Heuristic algorithms; Learning (artificial intelligence); Supervised learning; Training;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)
Conference_Location :
Orlando, FL
Type :
conf
DOI :
10.1109/ADPRL.2014.7010640
Filename :
7010640