On-line learning of a feedback controller for quasi-passive-dynamic walking by a stochastic policy gradient method

Author

Hitomi, Kentarou ; Shibata, Tomohiro ; Nakamura, Yutaka ; Ishii, Shin

Author_Institution

Nara Inst. of Sci. & Technol., Japan

fYear

2005

fDate

2-6 Aug. 2005

Firstpage

3803

Lastpage

3808

Abstract

A class of biped locomotion called passive dynamic walking (PDW) has been recognized as energy-efficient and as a key to understanding human walking. Although PDW is sensitive to initial conditions and disturbances, studies of quasi-PDW, which introduces supplementary actuators, have been reported to overcome this sensitivity. In this article, an on-line learning scheme for a feedback controller, based on a policy gradient reinforcement learning method, is proposed to realize quasi-PDW. Computer simulations show that the parameter of a quasi-PDW controller is automatically tuned by our method, which exploits the passivity of the robot dynamics. The obtained controller is robust, to some extent, against variations in the slope gradient.

Keywords

adaptive control; control engineering computing; feedback; gradient methods; learning (artificial intelligence); legged locomotion; motion control; robot dynamics; stochastic processes; biped locomotion; feedback controller; online learning; policy gradient reinforcement learning; quasipassive-dynamic walking; stochastic policy gradient method; Actuators; Adaptive control; Automatic control; Computer simulation; Energy consumption; Gradient methods; Humans; Learning; Legged locomotion; Stochastic processes; 2D biped; passive dynamic walk; reinforcement learning

fLanguage

English

Publisher

IEEE

Conference_Titel

2005 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2005)

Print_ISBN

0-7803-8912-3

Type

conf

DOI

10.1109/IROS.2005.1545258

Filename

1545258