Regional Information Center for Science and Technology - Stochastic policy gradient reinforcement learning on a simple 3D biped

DocumentCode :

2486864

Title :

Stochastic policy gradient reinforcement learning on a simple 3D biped

Author :

Tedrake, Russ ; Zhang, Teresa Weirui ; Seung, H. Sebastian

Author_Institution :

Center for Bits & Atoms, Massachusetts Institute of Technology, Cambridge, MA, USA

Volume :

Year :

2004

Date :

28 Sept.-2 Oct. 2004

Firstpage :

2849

Abstract :

We present a learning system that is able to quickly and reliably acquire a robust feedback control policy for 3D dynamic walking from a blank slate using only trials implemented on our physical robot. The robot begins walking within a minute and learning converges in approximately 20 minutes. This success can be attributed to the mechanics of our robot, which are modeled after a passive dynamic walker, and to a dramatic reduction in the dimensionality of the learning problem. We reduce the dimensionality by designing a robot with only 6 internal degrees of freedom and 4 actuators, by decomposing the control system in the frontal and sagittal planes, and by formulating the learning problem on the discrete return map dynamics. We apply a stochastic policy gradient algorithm to this reduced problem and decrease the variance of the update using a state-based estimate of the expected cost. This optimized learning system works quickly enough that the robot is able to continually adapt to the terrain as it walks.
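
Illustrative note (not part of the original record): the abstract mentions a stochastic policy gradient update whose variance is reduced by subtracting an estimate of the expected cost. The minimal Python sketch below shows that general idea on a made-up two-parameter problem. It is not the authors' implementation: the cost function `toy_cost`, the function `train`, and all parameter values are hypothetical, and a scalar running average stands in for the state-based estimate described in the abstract.

```python
import random


def toy_cost(w):
    """Hypothetical per-trial cost: squared distance from an arbitrary target."""
    target = [1.0, -2.0]
    return sum((wi - ti) ** 2 for wi, ti in zip(w, target))


def train(steps=2000, sigma=0.1, eta=0.01, baseline_rate=0.2, seed=0):
    """Gaussian-perturbation policy gradient with a running-average baseline.

    All defaults are assumptions chosen for this toy problem only.
    """
    rng = random.Random(seed)
    w = [0.0, 0.0]            # policy parameters, starting from a blank slate
    baseline = toy_cost(w)    # running estimate of the expected cost
    for _ in range(steps):
        # Perturb the parameters with zero-mean Gaussian noise and run one trial.
        noise = [rng.gauss(0.0, sigma) for _ in w]
        trial = [wi + ni for wi, ni in zip(w, noise)]
        cost = toy_cost(trial)
        # Subtracting the baseline leaves the expected update unchanged
        # but reduces its variance.
        advantage = cost - baseline
        # Move against noise directions that produced above-baseline cost.
        w = [wi - eta * advantage * ni / sigma ** 2 for wi, ni in zip(w, noise)]
        # Let the baseline track the recent average cost.
        baseline += baseline_rate * (cost - baseline)
    return w


if __name__ == "__main__":
    learned = train()
    print("learned parameters:", learned)
    print("final cost:", toy_cost(learned))
```

In this sketch the baseline only changes how noisy each update is, not its average direction, which is the role the abstract assigns to the expected-cost estimate.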

Keywords :

adaptive systems; feedback; learning (artificial intelligence); learning systems; legged locomotion; optimisation; reduced order systems; robot dynamics; robust control; state estimation; 3D biped robot; 3D dynamic walking; discrete return map dynamics; optimized learning system; robust feedback control; state-based estimate; stochastic policy gradient algorithm; stochastic policy gradient reinforcement learning; Actuators; Control systems; Costs; Feedback control; Learning systems; Legged locomotion; Robots; Robust control; State estimation; Stochastic processes;

Language :

English

Publisher :

IEEE

Conference_Title :

Intelligent Robots and Systems, 2004. (IROS 2004). Proceedings. 2004 IEEE/RSJ International Conference on

Print_ISBN :

0-7803-8463-6

Type :

conf

DOI :

10.1109/IROS.2004.1389841

Filename :

1389841

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2486864