Cooperative behavior acquisition by asynchronous policy renewal that enables simultaneous learning in multiagent environment

Author

Ikenoue, Shoichi ; Asada, Minoru ; Hosoda, Koh

Author_Institution

Dept. of Adaptive Machine Syst., Osaka Univ., Japan

Volume

3

fYear

2002

fDate

2002

Firstpage

2728

Abstract

This paper presents a method for simultaneous learning in a multiagent environment that facilitates the acquisition of cooperative behavior. Each agent has one policy and one action value function: the policy selects actions based on the action value function updated in the previous stage, while the action value function is learned from the episodes experienced under the current policy. As a result, all agents act on fixed policies, so the non-Markovian problem is avoided except during the update periods, whose timing depends on each agent's learning progress. Because such asynchronous renewal of action value functions can trap the exploration process in local maxima, the action values are initialized optimistically. Experimental results on a cooperative task in RoboCup, a dynamic multiagent environment, are shown and discussed.
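
The separation between a fixed acting policy and a continuously learned action value function can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `AsyncRenewalAgent`, the tabular representation, the one-step Q-learning style update, and the parameters `alpha`, `gamma`, and `optimistic_value` are all assumptions made for illustration, and the rule that decides when an agent renews its policy is left to the caller because the abstract does not specify it.

```python
from collections import defaultdict


class AsyncRenewalAgent:
    """Hypothetical agent: one frozen acting policy, one learning value table."""

    def __init__(self, n_actions, optimistic_value=1.0, alpha=0.1, gamma=0.9):
        self.n_actions = n_actions
        self.alpha = alpha    # learning rate (assumed)
        self.gamma = gamma    # discount factor (assumed)
        # Learning table: unseen (state, action) pairs start at an optimistic value.
        self.q = defaultdict(lambda: [optimistic_value] * n_actions)
        # Snapshot read by the acting policy; it stays fixed until renewal.
        self.policy_q = defaultdict(lambda: [optimistic_value] * n_actions)

    def act(self, state):
        """Greedy action with respect to the frozen snapshot (ties -> lowest index)."""
        values = self.policy_q[state]
        return max(range(self.n_actions), key=lambda a: values[a])

    def learn(self, state, action, reward, next_state, done):
        """Update only the learning table (assumed one-step Q-learning update)."""
        if done:
            target = reward
        else:
            target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])

    def renew_policy(self):
        """Copy the learned values into the snapshot used for acting."""
        for state, values in self.q.items():
            self.policy_q[state] = list(values)


# Each agent would call renew_policy() on its own schedule, so that the other
# agents see a stationary partner between renewals.
agent = AsyncRenewalAgent(n_actions=2)
first = agent.act("s0")                      # optimistic tie, picks action 0
agent.learn("s0", first, reward=0.0, next_state="s1", done=True)
unchanged = agent.act("s0")                  # policy is still frozen
agent.renew_policy()
renewed = agent.act("s0")                    # action 0 now looks worse than action 1
```

In this sketch, optimistic initialization makes every untried action look at least as good as a tried one that returned less than the optimistic value, so a renewed greedy policy keeps moving on to untried actions.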

Keywords

learning (artificial intelligence); mobile robots; multi-agent systems; multi-robot systems; robot dynamics; RoboCup; action value function; asynchronous policy renewal; cooperative behavior acquisition; cooperative task; dynamic multiagent environment; exploration process; multiagent environment; optimistic action values; simultaneous learning; soccer game situation; update periods; Adaptive systems; Centralized control; Communication switching; Communication system control; Control systems; Machine learning; Parallel robots

fLanguage

English

Publisher

IEEE

Conference_Titel

Intelligent Robots and Systems, 2002. IEEE/RSJ International Conference on

Print_ISBN

0-7803-7398-7

Type

conf

DOI

10.1109/IRDS.2002.1041682

Filename

1041682