  • DocumentCode
    3426921
  • Title
    Least squares temporal difference actor-critic methods with applications to robot motion control
  • Author
    Estanjini, Reza Moazzez ; Ding, Xu Chu ; Lahijanian, Morteza ; Wang, Jing ; Belta, Calin A. ; Paschalidis, Ioannis Ch.
  • Author_Institution
    Div. of Syst. Eng., Boston Univ., Boston, MA, USA
  • fYear
    2011
  • fDate
    12-15 Dec. 2011
  • Firstpage
    704
  • Lastpage
    709
  • Abstract
    We consider the problem of finding a control policy for a Markov Decision Process (MDP) to maximize the probability of reaching some states while avoiding some other states. This problem is motivated by applications in robotics, where such problems naturally arise when probabilistic models of robot motion are required to satisfy temporal logic task specifications. We transform this problem into a Stochastic Shortest Path (SSP) problem and develop a new approximate dynamic programming algorithm to solve it. This algorithm is of the actor-critic type and uses a least-square temporal difference learning method. It operates on sample paths of the system and optimizes the policy within a pre-specified class parameterized by a parsimonious set of parameters. We show its convergence to a policy corresponding to a stationary point in the parameters' space. Simulation results confirm the effectiveness of the proposed solution.
  • Keywords
    Markov processes; dynamic programming; learning (artificial intelligence); least squares approximations; motion control; robots; search problems; Markov decision process; SSP problem; approximate dynamic programming algorithm; control policy; least squares method; least-square temporal difference learning method; probability; robot motion control; stochastic shortest path problem; temporal difference actor-critic method; temporal logic task specification; Approximation algorithms; Convergence; Heuristic algorithms; Materials requirements planning; Robot sensing systems; Vectors; Markov Decision Processes; actor-critic methods; dynamic programming; robot motion control; robotics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Decision and Control and European Control Conference (CDC-ECC), 2011 50th IEEE Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    0743-1546
  • Print_ISBN
    978-1-61284-800-6
  • Electronic_ISBN
    0743-1546
  • Type
    conf
  • DOI
    10.1109/CDC.2011.6160485
  • Filename
    6160485