Inverse Reinforcement Learning using Expectation Maximization in mixture models

Author

Hahn, Jurgen ; Zoubir, Abdelhak M.

Author_Institution

Signal Process. Group, Tech. Univ. Darmstadt, Darmstadt, Germany

fYear

2015

fDate

19-24 April 2015

Firstpage

3721

Lastpage

3725

Abstract

Reinforcement Learning (RL) is an attractive tool for learning optimal controllers in the sense of a given reward function. In conventional RL, usually an expert is required to design the reward function as the efficiency of RL strongly depends on the latter. An alternative has been presented by the concept of Inverse Reinforcement Learning (IRL), where the reward function is estimated from observed data. In this work, we propose a novel approach for IRL based on a generative probabilistic model of RL. We derive an Expectation Maximization algorithm that is able to simultaneously estimate the reward and the optimal policy for finite state and action spaces, which can be easily extended for the infinite cases. By means of two toy examples, we show that the proposed algorithm works well even with a low number of observations and converges after only a few iterations.

Keywords

expectation-maximisation algorithm; learning (artificial intelligence); mixture models; probability; IRL; action spaces; expectation maximization algorithm; finite state spaces; generative probabilistic model; inverse reinforcement learning; mixture models; optimal controllers; optimal policy; reward function; Integrated circuits; Integrated optics; Mixture models; Probabilistic logic; Expectation Maximization; Inverse Reinforcement Learning; Markov Decision Process

fLanguage

English

Publisher

IEEE

Conference_Titel

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178666

Filename

7178666