Regional Information Center for Science and Technology (RICeST) - Reinforcement learning with guided policy search using Gaussian processes

DocumentCode :

2771846

Title :

Reinforcement learning with guided policy search using Gaussian processes

Author :

Jakab, Hunor S. ; Csató, Lehel

Author_Institution :

Dept. of Comput. Sci., Babes-Bolyai Univ., Cluj-Napoca, Romania

fYear :

2012

fDate :

10-15 June 2012

Firstpage :

Lastpage :

Abstract :

Gradient-based policy search algorithms benefit greatly from the availability of a properly estimated state or state-action value function, which can be used to reduce the variance of the gradient estimates. Additionally, the use of Gaussian processes for value function approximation provides a fully probabilistic model where, using the uncertainty in the estimated value function, we can assess the amount of exploration required. In this article we present two modalities for adjusting different characteristics of the exploration in on-line learning of control policies for problems with continuous state-action spaces. The proposed methods exploit the fully probabilistic nature of Gaussian processes and aim to constrain the exploration to relevant subspaces only, thereby speeding up convergence. We present experiments on a simulated control task to demonstrate the validity of our algorithms.
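
The idea in the abstract of using the uncertainty of a Gaussian process value estimate to decide how much to explore can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the two exploration modalities, so the squared-exponential kernel, the noise level, the toy data, and the rule in `exploration_std` (action noise proportional to the posterior standard deviation) are all assumptions made for illustration only.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two state vectors (assumed kernel)."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-0.5 * sq / length_scale ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_from_lower(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(states, returns, query, noise_var=1e-2):
    """GP posterior mean and variance of the value function at `query`."""
    n = len(states)
    K = [[rbf(states[i], states[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_from_lower(L, solve_lower(L, returns))
    k_star = [rbf(s, query) for s in states]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(query, query) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

def exploration_std(var, base_std=0.5):
    """Hypothetical rule: scale action noise by the value estimate's posterior std."""
    return base_std * math.sqrt(var)

# Toy data (made up): visited 1-D states and their observed returns.
states = [(0.0,), (0.5,), (1.0,)]
returns = [1.0, 0.8, 0.2]
_, var_near = gp_predict(states, returns, (0.5,))  # close to visited states
_, var_far = gp_predict(states, returns, (3.0,))   # far from visited states
```

Under these assumptions, a query state far from the visited states has a larger posterior variance than one near them, so the sketch would inject more exploration noise there and less where the value estimate is already confident.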

Keywords :

Gaussian processes; approximation theory; gradient methods; learning (artificial intelligence); estimated value function; function approximation; gradient based policy search algorithms; gradient estimation; guided policy search; probabilistic model; reinforcement learning; state action value function; Approximation algorithms; Noise; Robots; Vectors

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Neural Networks (IJCNN), The 2012 International Joint Conference on

Conference_Location :

Brisbane, QLD

ISSN :

2161-4393

Print_ISBN :

978-1-4673-1488-6

Electronic_ISBN :

2161-4393

Type :

conf

DOI :

10.1109/IJCNN.2012.6252509

Filename :

6252509

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2771846