  • DocumentCode
    2392870
  • Title
    Heuristic Dynamic Programming strategy with eligibility traces
  • Author
    Li, Tao; Zhao, Dongbin; Yi, Jianqiang
  • Author_Institution
    Lab. of Complex Syst. & Intell. Sci., Chinese Acad. of Sci., Beijing
  • fYear
    2008
  • fDate
    11-13 June 2008
  • Firstpage
    4535
  • Lastpage
    4540
  • Abstract
    In traditional adaptive dynamic programming (ADP), only a one-step estimate is used in the training process, so learning efficiency is low. If multi-step estimates are included, the learning process can be sped up. Eligibility traces record the past and current gradients of the estimate and can be combined with ADP to speed up learning. In this paper, heuristic dynamic programming (HDP), a typical ADP structure, is considered. An algorithm, HDP(lambda), that integrates HDP with eligibility traces is presented. The algorithm is explained from both the forward view and the backward view for clarity, and the equivalence of the two views is analyzed. Furthermore, the differences between HDP and HDP(lambda) are examined through both theoretical analysis and simulation results. The problem of balancing a pendulum robot (pendubot) is adopted as a benchmark. The results indicate that, compared to HDP, HDP(lambda) achieves a higher convergence rate and training efficiency.
  • Keywords
    dynamic programming; learning (artificial intelligence); pendulums; robots; adaptive dynamic programming; eligibility traces; heuristic dynamic programming strategy; learning efficiency; pendulum robot balancing; training efficiency; training process; Algorithm design and analysis; Analytical models; Computational efficiency; Cost function; Delay; Dynamic programming; Learning; Optimal control; Robots; USA Councils; Adaptive dynamic programming; Eligibility trace; Heuristic dynamic programming; Pendulum robot;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    American Control Conference, 2008
  • Conference_Location
    Seattle, WA
  • ISSN
    0743-1619
  • Print_ISBN
    978-1-4244-2078-0
  • Electronic_ISBN
    0743-1619
  • Type
    conf
  • DOI
    10.1109/ACC.2008.4587210
  • Filename
    4587210