From Robots to Reinforcement Learning

Author

Du, Tongchun; Cox, Michael T.; Perlis, Don; Shamwell, Jared; Oates, Tim

Author_Institution

Dept. of Autom., Harbin Eng. Univ., Harbin, China

fYear

2013

fDate

4-6 Nov. 2013

Firstpage

540

Lastpage

545

Abstract

In this paper, we review recent advances in Reinforcement Learning (RL) in light of potential applications to robotics, introduce the basic concepts of RL and the Markov Decision Process (MDP), and compare different RL algorithms such as Q-learning, Temporal Difference learning, the Actor Critic, and the Natural Actor Critic. We conclude that policy gradient methods are better suited to continuous state/action MDP problems than RL with lookup tables or general function approximators. Further, natural policy gradient methods can converge efficiently to locally optimal solutions. Some simulation results are given to support our arguments. We also present a brief overview of our approach to developing an autonomous robot agent that can perceive, learn from, and interact with the environment, and reason about and handle unexpected problems using its knowledge base.

Keywords

Markov processes; control engineering computing; decision theory; gradient methods; knowledge based systems; learning (artificial intelligence); mobile robots; Markov decision process; Q-learning; RL algorithms; autonomous robot agent; continuous state/action MDP problems; knowledge base; natural actor critic; natural policy gradient methods; reinforcement learning; robotics; temporal difference learning; Approximation algorithms; Cognition; Convergence; Function approximation; Gradient methods; Knowledge based systems; Robots; Natural Actor Critic; Reinforcement Learning; autonomous robots; policy gradients; robot knowledge base; value function

fLanguage

English

Publisher

IEEE

Conference_Titel

Tools with Artificial Intelligence (ICTAI), 2013 IEEE 25th International Conference on

Conference_Location

Herndon, VA

ISSN

1082-3409

Print_ISBN

978-1-4799-2971-9

Type

conf

DOI

10.1109/ICTAI.2013.86

Filename

6735297