• DocumentCode
    2845455
  • Title
    Postponed Updates for Temporal-Difference Reinforcement Learning
  • Author
    Van Seijen, Harm; Whiteson, Shimon
  • Author_Institution
    TNO Defence, Security & Safety, The Hague, Netherlands
  • fYear
    2009
  • fDate
    Nov. 30 - Dec. 2, 2009
  • Firstpage
    665
  • Lastpage
    672
  • Abstract
    This paper presents postponed updates, a new strategy for TD methods that can improve sample efficiency without incurring the computational and space requirements of model-based RL. By recording the agent's last-visit experience, the agent can delay its update until the given state is revisited, thereby improving the quality of the update. Experimental results demonstrate that postponed updates outperform several competitors, most notably eligibility traces, a traditional way to improve the sample efficiency of TD methods. It achieves this without the need to tune the extra parameter that eligibility traces require. (An illustrative sketch of the last-visit idea follows this record.)
  • Keywords
    learning (artificial intelligence); model-based reinforcement learning; postponed updates; temporal-difference reinforcement learning; Computational efficiency; Delay; Informatics; Intelligent agent; Intelligent systems; Learning; Optimal control; Safety; Security; State estimation; eligibility traces; reinforcement learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Ninth International Conference on Intelligent Systems Design and Applications (ISDA '09), 2009
  • Conference_Location
    Pisa, Italy
  • Print_ISBN
    978-1-4244-4735-0
  • Electronic_ISBN
    978-0-7695-3872-3
  • Type
    conf
  • DOI
    10.1109/ISDA.2009.76
  • Filename
    5365052
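
The abstract above describes postponed updates at a high level: record a state's last-visit experience and apply the corresponding update only when that state is revisited, by which time the successor's value estimate has been refined. The sketch below is one plausible reading of that idea in a tabular Q-learning-style setting; it is not the paper's algorithm, and the names (PostponedAgent, act, observe) are hypothetical.

```python
import random
from collections import defaultdict

class PostponedAgent:
    """Tabular Q-learning-style agent with postponed (last-visit) updates.

    Illustrative sketch only; the paper's exact algorithm may differ.
    """

    def __init__(self, n_actions, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.q = defaultdict(float)   # Q-value table keyed by (state, action)
        self.last_visit = {}          # state -> (action, reward, next_state)
        self.n_actions = n_actions
        self.alpha = alpha            # step size
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration rate

    def act(self, state):
        # Apply the update postponed since this state's last visit: the
        # successor's value estimate has since been refined, so the update
        # target is better than it was at the time of the transition.
        if state in self.last_visit:
            action, reward, next_state = self.last_visit.pop(state)
            target = reward + self.gamma * max(
                self.q[(next_state, b)] for b in range(self.n_actions))
            self.q[(state, action)] += self.alpha * (
                target - self.q[(state, action)])
        # Epsilon-greedy action selection.
        if random.random() < self.epsilon:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda b: self.q[(state, b)])

    def observe(self, state, action, reward, next_state):
        # Record the last-visit experience instead of updating immediately;
        # terminal-state handling is omitted for brevity.
        self.last_visit[state] = (action, reward, next_state)
```

Note that the only extra storage is one transition per visited state, which is consistent with the abstract's claim of avoiding the space requirements of model-based RL while still benefiting from more up-to-date successor values.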