Title :
A Hessian actor-critic algorithm
Author :
Wang, Jing; Paschalidis, Ioannis C.
Author_Institution :
Division of Systems Engineering, Boston University, Boston, MA, USA
Abstract :
We consider Markov Decision Processes (MDPs) whose policy is parameterized by a parsimonious set of parameters, and we seek to optimize the policy over these parameters. In this setting, optimization can be carried out using gradient ascent, and a well-designed parameterized policy can significantly reduce the problem's complexity. Existing algorithms, however, often converge slowly because they search along the gradient direction in a steepest-ascent fashion. In this paper, we first propose an estimate of the Hessian of the overall reward received by the decision maker. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
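The Newton-like update the abstract describes can be illustrated on a toy problem. The sketch below is not the paper's algorithm (which builds sample-based actor-critic estimates); it is a minimal, assumed setup: a one-step softmax-policy "bandit" MDP where the policy gradient and Hessian of the expected reward J(θ) are computed exactly via the score function ∇log π, and the parameters are updated with a damped Newton (Levenberg-Marquardt style) ascent step instead of plain steepest ascent. All function names here are illustrative.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())      # shift for numerical stability
    return z / z.sum()

def grad_hessian(theta, r):
    """Exact gradient and Hessian of J(theta) = E_pi[r] for a softmax
    policy -- a stand-in for the sample-based actor-critic estimates."""
    pi = softmax(theta)
    n = len(r)
    J = pi @ r
    A = np.diag(pi) - np.outer(pi, pi)   # -A is the Hessian of log pi(a)
    g = np.zeros(n)
    H = -J * A                            # sum_a pi_a r_a * Hess log pi(a)
    for a in range(n):
        s = np.eye(n)[a] - pi             # score vector: grad log pi(a)
        g += pi[a] * r[a] * s
        H += pi[a] * r[a] * np.outer(s, s)
    return J, g, H

def newton_ascent(r, iters=30):
    theta = np.zeros(len(r))
    hist = []
    for _ in range(iters):
        J, g, H = grad_hessian(theta, r)
        hist.append(J)
        # Damped Newton step: lam > ||H||_2 makes (lam*I - H) positive
        # definite, so d is guaranteed to be an ascent direction.
        lam = 0.1 + np.linalg.norm(H, 2)
        d = np.linalg.solve(lam * np.eye(len(r)) - H, g)
        theta = theta + d
    return theta, hist

theta, hist = newton_ascent(np.array([1.0, 2.0, 5.0]))
```

Curvature information lets the step size adapt per direction, which is the intuition behind the faster convergence claimed over steepest-ascent actor-critic methods; the damping term guards against an indefinite Hessian.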
Keywords :
Markov processes; Newton method; decision making; gradient methods; optimisation; robots; Hessian actor-critic algorithm; MDP; Markov decision processes; Newton-like method; actor-critic type; decision maker; gradient ascent method; gradient direction; optimization; robotics application; steepest ascent way; Algorithm design and analysis; Convergence; Markov processes; Newton method; Radio frequency; Robots; Vectors; Actor-critic algorithms; Autonomous robots; Markov decision processes; Newton's method
Conference_Titel :
53rd IEEE Conference on Decision and Control (CDC), 2014
Conference_Location :
Los Angeles, CA, USA
Print_ISBN :
978-1-4799-7746-8
DOI :
10.1109/CDC.2014.7039533