Title :
A Hessian actor-critic algorithm
Author :
Wang, Jing; Paschalidis, Ioannis C.
Author_Institution :
Division of Systems Engineering, Boston University, Boston, MA, USA
Abstract :
We consider Markov Decision Processes (MDPs) whose policy is parameterized by a parsimonious set of parameters, and we seek to optimize the policy over these parameters. In this setting, optimization can be carried out using gradient ascent, and a well-designed parameterized policy can significantly reduce the problem's complexity. Existing algorithms, however, often converge slowly because they search along the gradient direction in a steepest-ascent fashion. In this paper, we first propose an estimate of the Hessian of the overall reward received by the decision maker. Based on this estimate, we then introduce a new Newton-like method of the actor-critic type. We compare the new algorithm with several existing algorithms in a robotics application and demonstrate that our method exhibits faster convergence.
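The Newton-like update the abstract describes can be illustrated on a toy problem. The sketch below is not the paper's algorithm (which builds sample-based actor-critic estimates); it is a minimal, assumed setup: a one-step softmax-policy "bandit" MDP where the policy gradient and Hessian of the expected reward J(θ) are computed exactly via the score function ∇log π, and the parameters are updated with a damped Newton (Levenberg-Marquardt style) ascent step instead of plain steepest ascent. All function names here are illustrative.

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())      # shift for numerical stability
    return z / z.sum()

def grad_hessian(theta, r):
    """Exact gradient and Hessian of J(theta) = E_pi[r] for a softmax
    policy -- a stand-in for the sample-based actor-critic estimates."""
    pi = softmax(theta)
    n = len(r)
    J = pi @ r
    A = np.diag(pi) - np.outer(pi, pi)   # -A is the Hessian of log pi(a)
    g = np.zeros(n)
    H = -J * A                            # sum_a pi_a r_a * Hess log pi(a)
    for a in range(n):
        s = np.eye(n)[a] - pi             # score vector: grad log pi(a)
        g += pi[a] * r[a] * s
        H += pi[a] * r[a] * np.outer(s, s)
    return J, g, H

def newton_ascent(r, iters=30):
    theta = np.zeros(len(r))
    hist = []
    for _ in range(iters):
        J, g, H = grad_hessian(theta, r)
        hist.append(J)
        # Damped Newton step: lam > ||H||_2 makes (lam*I - H) positive
        # definite, so d is guaranteed to be an ascent direction.
        lam = 0.1 + np.linalg.norm(H, 2)
        d = np.linalg.solve(lam * np.eye(len(r)) - H, g)
        theta = theta + d
    return theta, hist

theta, hist = newton_ascent(np.array([1.0, 2.0, 5.0]))
```

Curvature information lets the step size adapt per direction, which is the intuition behind the faster convergence claimed over steepest-ascent actor-critic methods; the damping term guards against an indefinite Hessian.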
Keywords :
Markov processes; Newton method; decision making; gradient methods; optimisation; robots; Hessian actor-critic algorithm; MDP; Markov decision processes; Newton-like method; actor-critic type; decision maker; gradient ascent method; gradient direction; optimization; robotics application; steepest ascent way; Algorithm design and analysis; Convergence; Markov processes; Newton method; Radio frequency; Robots; Vectors; Actor-critic algorithms; Autonomous robots; Markov decision processes; Newton's method
Conference_Titel :
53rd IEEE Conference on Decision and Control (CDC), 2014
Conference_Location :
Los Angeles, CA, USA
Print_ISBN :
978-1-4799-7746-8
DOI :
10.1109/CDC.2014.7039533