Title :
Timesharing-tracking framework for decentralized reinforcement learning in fully cooperative multi-agent system
Author :
Xin Chen ; Bo Fu ; Yong He ; Min Wu
Author_Institution :
Sch. of Autom., China Univ. of Geo-Sci., Wuhan, China
Abstract :
Dimension-reduced, decentralized learning is widely viewed as an efficient way to solve multi-agent cooperative learning problems in high-dimensional spaces. However, the non-stationary environment induced by concurrent learning makes decentralized learning hard to converge and degrades its performance. To tackle this problem, this paper proposes a timesharing-tracking framework (TTF), stemming from the idea that alternating learning at the microscopic level yields concurrent learning at the macroscopic level; within TTF, joint-state best-response Q-learning (BRQ-learning) serves as the primary algorithm for adapting to the companions' policies. With a properly defined switching principle, TTF lets each agent learn the best response to the others at different joint states, so that, viewed over the whole joint-state space, the agents learn the optimal cooperative policy simultaneously. Simulation results show that the proposed algorithm learns the optimal joint behavior with less computation and faster convergence than two other classical learning algorithms.
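The core idea of the abstract, that agents take turns learning a best response while the others' policies are held fixed, can be sketched in miniature. The snippet below is a hypothetical illustration, not the paper's algorithm: it uses a single cooperative matrix game (standing in for one joint state) with a made-up reward matrix, a learning rate of 1, and a simple alternating switching rule.

```python
import numpy as np

# Hypothetical cooperative matrix game standing in for one joint state.
# Both agents receive the same reward R[a1, a2]; the joint action (1, 1)
# is the cooperative optimum with payoff 10.
R = np.array([[5.0, 6.0],
              [0.0, 10.0]])

def timesharing_best_response(R, sweeps=10):
    """Minimal sketch of the timesharing idea: agents alternate as the
    sole learner, each computing a best response to the companion's
    currently fixed greedy action (alternating in the microscopic view,
    concurrent in the macroscopic view)."""
    q = [np.zeros(R.shape[0]), np.zeros(R.shape[1])]  # per-agent Q-values
    a = [0, 0]                                        # current greedy joint action
    for t in range(sweeps):
        i = t % 2                      # switching rule: alternate the learner
        for ai in range(len(q[i])):    # evaluate own actions vs. fixed companion
            joint = (ai, a[1]) if i == 0 else (a[0], ai)
            q[i][ai] = R[joint]        # stateless game: Q equals the reward
        a[i] = int(np.argmax(q[i]))    # adopt the best response
    return tuple(a), float(R[tuple(a)])

joint_action, value = timesharing_best_response(R)
```

Starting from the joint action (0, 0), the alternating updates move through (0, 1) to the optimum (1, 1): each turn, the fixed companion policy makes the learner's environment stationary, which is the convergence difficulty the TTF switching principle is designed to address.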
Keywords :
learning (artificial intelligence); multi-agent systems; multi-robot systems; BRQ-learning; TTF; alternative learning; concurrent learning; decentralized reinforcement learning; fully cooperative multiagent system; joint-state best-response Q-learning; optimal cooperative policy learning; switching principle; timesharing-tracking framework; Games; Learning (artificial intelligence); Multi-agent systems; Optimization; Reinforcement learning; Robots; Switches; Cooperative multi-agent system; immediate individual reward; reinforcement learning; timesharing tracking;
Journal_Title :
IEEE/CAA Journal of Automatica Sinica
DOI :
10.1109/JAS.2014.7004541