DocumentCode
2845455
Title
Postponed Updates for Temporal-Difference Reinforcement Learning
Author
Van Seijen, Harm; Whiteson, Shimon
Author_Institution
TNO Defence, Security & Safety, The Hague, Netherlands
fYear
2009
fDate
Nov. 30 – Dec. 2, 2009
Firstpage
665
Lastpage
672
Abstract
This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based RL. By recording the agent's last-visit experience, the agent can postpone its update until the given state is revisited; because the value estimates used for bootstrapping have been refined in the meantime, the quality of the update improves. Experimental results demonstrate that postponed updates outperform several competitors, most notably eligibility traces, a traditional way to improve the sample efficiency of TD methods, and do so without requiring an extra parameter to tune, as eligibility traces do.
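As an illustration of the mechanism the abstract describes, the following is a minimal sketch assuming a tabular Q-learning-style agent with a discrete action set; the names (on_transition, last_visit) and the exact update rule are assumptions made for illustration, not necessarily the paper's exact algorithm.

from collections import defaultdict

ACTIONS = [0, 1]          # assumed discrete action set
alpha, gamma = 0.1, 0.95  # step size and discount factor

Q = defaultdict(float)    # Q[(state, action)] value table
last_visit = {}           # state -> (action, reward, next_state) recorded on the last visit

def on_transition(s, a, r, s_next):
    # If s was visited before, its recorded experience has been waiting;
    # apply that update only now, so the bootstrap target benefits from
    # every refinement of the value table made since the previous visit.
    if s in last_visit:
        a_prev, r_prev, s_prev_next = last_visit[s]
        target = r_prev + gamma * max(Q[(s_prev_next, b)] for b in ACTIONS)
        Q[(s, a_prev)] += alpha * (target - Q[(s, a_prev)])
    # Postpone the update for the current experience until s is revisited.
    last_visit[s] = (a, r, s_next)

In such a scheme, the experiences still pending in last_visit would presumably be flushed with ordinary TD updates at episode end, so that no transition is discarded.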
Keywords
learning (artificial intelligence); model-based reinforcement learning; postponed updates; temporal-difference reinforcement learning; Computational efficiency; Delay; Informatics; Intelligent agent; Intelligent systems; Learning; Optimal control; Safety; Security; State estimation; eligibility traces; reinforcement learning
fLanguage
English
Publisher
ieee
Conference_Titel
2009 Ninth International Conference on Intelligent Systems Design and Applications (ISDA '09)
Conference_Location
Pisa, Italy
Print_ISBN
978-1-4244-4735-0
Electronic_ISBN
978-0-7695-3872-3
Type
conf
DOI
10.1109/ISDA.2009.76
Filename
5365052