Learning from demonstration using a multi-valued function regressor for time-series data

Author

Butterfield, Jesse ; Osentoski, Sarah ; Jay, Graylin ; Jenkins, Odest Chadwicke

fYear

2010

fDate

6-8 Dec. 2010

Firstpage

328

Lastpage

333

Abstract

Using data collected from human teleoperation, our goal is to learn a control policy that maps perception to actuation. Such policies are potentially multi-valued with regard to perception, with a single input mapping to multiple outputs depending on the user's objective at a particular time. We propose a multi-valued function regressor to learn a larger class of robot control policies from human demonstration and extend the Hierarchical Dirichlet Process Hidden Markov Model to discover latent variables representing unknown objectives in the demonstrated data and the transitions between these objectives. Each of these objectives requires only a single-valued policy function, and thus can be learned with a Gaussian process function regressor. The learned transitions between these objectives determine the correct actuation where the complete policy function is multi-valued. We present the results of experiments conducted on the Nao humanoid robot platform.

Keywords

Gaussian processes; hidden Markov models; humanoid robots; learning (artificial intelligence); regression analysis; telecontrol; time series; Gaussian process; Nao humanoid robot platform; actuation; control policy; hidden Markov model; hierarchical Dirichlet process; human demonstration; human teleoperation; learning; multivalued function regressor; perception; robot control policies; time-series data; Head; Hidden Markov models; Humans; Kernel; Robot kinematics; Robot sensing systems

fLanguage

English

Publisher

IEEE

Conference_Titel

Humanoid Robots (Humanoids), 2010 10th IEEE-RAS International Conference on

Conference_Location

Nashville, TN

Print_ISBN

978-1-4244-8688-5

Electronic_ISBN

978-1-4244-8689-2

Type

conf

DOI

10.1109/ICHR.2010.5686284

Filename

5686284