Learning using multidimensional internal rewards

Author

Kobayashi, Yuichi ; Yuasa, Hideo ; Arai, Tamio

Author_Institution

Dept. of Precision Eng., Tokyo Univ., Japan

Volume

1

fYear

2000

fDate

2000

Firstpage

572

Abstract

Complicated tasks are often difficult to express as single-reward systems. In the human learning process, the relation between sensory inputs and action outputs can be understood to have been acquired beforehand using an internal multidimensional reward system. We introduce reinforcement learning under multidimensional evaluation. The internal reward system includes both immediate evaluation and delayed rewards. The proposed learning system is a two-layered Q-learning architecture combined with dynamic cell structure. For a pushing task performed by a manipulator, we assume that information from touch sensors and from the motion detector of the vision system is available. Simulation showed that the knowledge acquired in the lower layer greatly helps in learning the pushing task.

Keywords

image sensors; learning (artificial intelligence); robot programming; tactile sensors; action outputs; complicated tasks; delayed rewards; dynamic cell structure; immediate evaluation; motion detector; multidimensional evaluation; multidimensional internal rewards; pushing task; reinforcement learning; sensory inputs; touch sensors; vision system; Delay; Detectors; Humans; Learning systems; Manipulators; Motion detection; Multidimensional systems; Precision engineering; Robot sensing systems; Tactile sensors

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Robots and Systems, 2000. (IROS 2000). Proceedings. 2000 IEEE/RSJ International Conference on

Conference_Location

Takamatsu

Print_ISBN

0-7803-6348-5

Type

conf

DOI

10.1109/IROS.2000.894665

Filename

894665