DocumentCode :
2717600
Title :
Evaluation of Policy Gradient Methods and Variants on the Cart-Pole Benchmark
Author :
Riedmiller, Martin ; Peters, Jan ; Schaal, Stefan
Author_Institution :
Neuroinformatics Group, Osnabrueck Univ.
fYear :
2007
fDate :
1-5 April 2007
Firstpage :
254
Lastpage :
261
Abstract :
In this paper, we evaluate different variants of the three main kinds of model-free policy gradient methods, i.e., finite difference gradients, 'vanilla' policy gradients, and natural policy gradients. Each of these methods is first presented in its simple form and subsequently refined and optimized. By carrying out numerous experiments on the cart-pole regulator benchmark, we aim to provide a useful baseline for future research on parameterized policy search algorithms. Portable C++ code is provided for both the plant and the algorithms; thus, the results in this paper can be re-evaluated and reused, and new algorithms can be inserted with ease.
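To make the finite-difference family of methods mentioned in the abstract concrete, the following is a minimal, self-contained sketch of a central finite-difference policy gradient estimator on a cart-pole regulator task. It is an assumption-laden illustration only: the linear state-feedback policy, the quadratic cost, the physical constants, and all identifiers below are hypothetical and do not reproduce the C++ code distributed with the paper.

```cpp
// Hypothetical sketch: finite-difference policy gradient on a cart-pole
// regulator. All names, costs, and constants are illustrative assumptions,
// not the code released with the paper.
#include <array>
#include <cmath>
#include <cstdio>

struct CartPoleState { double x, xdot, theta, thetadot; };

// One Euler step of the standard frictionless cart-pole dynamics.
CartPoleState step(const CartPoleState& s, double force) {
    const double g = 9.81, mc = 1.0, mp = 0.1, l = 0.5, dt = 0.02;
    const double total = mc + mp;
    const double cosT = std::cos(s.theta), sinT = std::sin(s.theta);
    const double tmp = (force + mp * l * s.thetadot * s.thetadot * sinT) / total;
    const double thetaAcc = (g * sinT - cosT * tmp) /
                            (l * (4.0 / 3.0 - mp * cosT * cosT / total));
    const double xAcc = tmp - mp * l * thetaAcc * cosT / total;
    return { s.x + dt * s.xdot, s.xdot + dt * xAcc,
             s.theta + dt * s.thetadot, s.thetadot + dt * thetaAcc };
}

using Params = std::array<double, 4>;  // linear state-feedback gains

// Return of one rollout: negative accumulated quadratic regulator cost,
// with a terminal penalty if the pole falls or the cart leaves the track.
double rollout(const Params& w) {
    CartPoleState s{0.0, 0.0, 0.1, 0.0};   // start slightly off upright
    double ret = 0.0;
    for (int t = 0; t < 200; ++t) {
        double u = w[0]*s.x + w[1]*s.xdot + w[2]*s.theta + w[3]*s.thetadot;
        u = std::fmax(-10.0, std::fmin(10.0, u));          // clip the force
        s = step(s, u);
        ret -= s.theta*s.theta + 0.1*s.thetadot*s.thetadot + 0.01*s.x*s.x;
        if (std::fabs(s.theta) > 0.7 || std::fabs(s.x) > 2.4) {
            ret -= 50.0;                                   // failure penalty
            break;
        }
    }
    return ret;
}

int main() {
    Params w{0.0, 0.0, 0.0, 0.0};
    const double eps = 0.05, alpha = 0.1;
    for (int iter = 0; iter < 300; ++iter) {
        // Central finite differences: perturb each gain separately and
        // compare rollout returns.
        Params grad{};
        for (std::size_t i = 0; i < w.size(); ++i) {
            Params wp = w, wm = w;
            wp[i] += eps; wm[i] -= eps;
            grad[i] = (rollout(wp) - rollout(wm)) / (2.0 * eps);
        }
        for (std::size_t i = 0; i < w.size(); ++i) w[i] += alpha * grad[i];
        if (iter % 50 == 0)
            std::printf("iter %3d  return %.3f\n", iter, rollout(w));
    }
    return 0;
}
```

The 'vanilla' and natural policy gradient variants evaluated in the paper replace the per-parameter perturbations above with likelihood-ratio gradient estimates from a stochastic policy (and, for the natural gradient, a Fisher-information correction); the rollout structure stays the same.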
Keywords :
finite difference methods; gradient methods; learning (artificial intelligence); search problems; cart pole regulator benchmark; finite difference gradients; model-free policy gradient methods; natural policy gradients; parameterized policy search algorithms; vanilla policy gradients; Dynamic programming; Finite difference methods; Gradient methods; Learning; Legged locomotion; Motor drives; Optimization methods; Regulators; Solids; Stochastic processes;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007)
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0706-0
Type :
conf
DOI :
10.1109/ADPRL.2007.368196
Filename :
4220841