Regional Information Center for Science and Technology - Improved Simultaneous Perturbation Stochastic Approximation and Its Application in Reinforcement Learning

DocumentCode :

1946553

Title :

Improved Simultaneous Perturbation Stochastic Approximation and Its Application in Reinforcement Learning

Author :

Yue, Xiumei

Author_Institution :

Dept. of Electr. & Electron. Eng., Huangshi Inst. of Technol., Huangshi

Volume :

fYear :

2008

fDate :

12-14 Dec. 2008

Firstpage :

329

Lastpage :

332

Abstract :

In optimization problems where only measurements of the objective function are available, it is difficult or impossible to obtain the gradient of the objective function directly. Although the second-order simultaneous perturbation stochastic approximation (2SPSA) algorithm addresses this problem with an efficient gradient approximation that relies only on measurements of the objective function, its accuracy depends on the conditioning of the objective function's Hessian matrix. To eliminate the influence of the Hessian, this paper uses a nonlinear conjugate gradient method to determine the search direction. By combining different nonlinear conjugate gradient methods, it ensures that each search direction is a descent direction. In addition to improving the search direction, this paper uses inexact line searches to determine the step size. With a descent search direction and an appropriate step size, the improved SPSA converges faster than 2SPSA. The advantages of the improved SPSA are validated by applying it to reinforcement learning.
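
The ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the record does not give the update formulas, so everything specific here is an assumption. The gradient estimate is the standard two-measurement SPSA form with Rademacher perturbations. The conjugate gradient coefficient is a common Hestenes-Stiefel / Dai-Yuan hybrid with a steepest-descent restart, chosen only as one plausible way to combine methods so that each direction is a descent direction with respect to the estimated gradient. The inexact line search is plain Armijo backtracking. The function names (`spsa_gradient`, `backtracking_step`, `spsa_cg_minimize`) and all default parameters are hypothetical.

```python
import numpy as np


def spsa_gradient(f, x, c, rng):
    """One two-measurement SPSA gradient estimate of f at x."""
    # Rademacher (+1/-1) perturbation applied to all coordinates at once.
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    diff = f(x + c * delta) - f(x - c * delta)
    return diff / (2.0 * c * delta)


def backtracking_step(f, x, d, g, a0=1.0, shrink=0.5, c1=1e-4, max_iter=30):
    """Armijo backtracking (assumed line search); 0.0 if no step is accepted."""
    fx, slope = f(x), float(g @ d)
    a = a0
    for _ in range(max_iter):
        if f(x + a * d) <= fx + c1 * a * slope:
            return a
        a *= shrink
    return 0.0


def spsa_cg_minimize(f, x0, iters=200, c=1e-2, seed=0):
    """SPSA gradient estimates driving a hybrid nonlinear CG search (sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    g = spsa_gradient(f, x, c, rng)
    d = -g
    for _ in range(iters):
        x = x + backtracking_step(f, x, d, g) * d
        g_new = spsa_gradient(f, x, c, rng)
        y = g_new - g
        denom = float(d @ y)
        if abs(denom) > 1e-12:
            # Assumed hybrid: clip between Hestenes-Stiefel and Dai-Yuan.
            beta_hs = float(g_new @ y) / denom
            beta_dy = float(g_new @ g_new) / denom
            beta = max(0.0, min(beta_hs, beta_dy))
        else:
            beta = 0.0
        d_new = -g_new + beta * d
        # Restart with steepest descent if d_new is not a descent direction
        # with respect to the estimated gradient.
        if float(g_new @ d_new) >= 0.0:
            d_new = -g_new
        g, d = g_new, d_new
    return x
```

A call such as `spsa_cg_minimize(lambda x: float(np.sum(x ** 2)), np.array([1.0, -2.0, 0.5]))` exercises the sketch on a noise-free quadratic. Because a step is accepted only when the Armijo condition holds, the measured objective never increases in that setting; with noisy measurements this guarantee no longer applies.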

Keywords :

Hessian matrices; approximation theory; conjugate gradient methods; learning (artificial intelligence); stochastic processes; gradient approximation; matrix conditioning; nonlinear conjugate gradient method; objective function Hessian; reinforcement learning; simultaneous perturbation stochastic approximation algorithm; Acceleration; Application software; Approximation algorithms; Computer science; Convergence; Finite difference methods; Gradient methods; Learning; Software engineering; Stochastic processes; SPSA; nonlinear conjugate gradient method; reinforcement learning;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Science and Software Engineering, 2008 International Conference on

Conference_Location :

Wuhan, Hubei

Print_ISBN :

978-0-7695-3336-0

Type :

conf

DOI :

10.1109/CSSE.2008.1019

Filename :

4721754

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1946553