DocumentCode :
1731921
Title :
Timesharing-tracking: A new framework for decentralized reinforcement learning in cooperative multi-agent systems
Author :
Fu Bo ; Chen Xin ; He Yong ; Wu Min
Author_Institution :
Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
fYear :
2013
Firstpage :
7054
Lastpage :
7059
Abstract :
This paper discusses how to learn the optimal cooperative policy in a decentralized way when the immediate individual reward is known. We propose a timesharing-tracking framework (TTF) in which agents learn their optimal policies alternately on different states, so as to realize macroscopic simultaneous learning. We then propose joint-state Q-learning with best response to companions (BRQ-learning). Further, the BRQ-learning algorithm is incorporated into the TTF, yielding a mechanism named multi-agent BRQ-learning with timesharing-tracking (BRQL-TT) that achieves the optimal group policy. Simulation results illustrate that the proposed algorithm learns the optimal joint behavior with less computation and at a faster speed than two other classical learning algorithms.
Keywords :
learning (artificial intelligence); multi-agent systems; BRQ-learning with timesharing-tracking; BRQL-TT; TTF; best-response; cooperative multiagent systems; decentralized reinforcement learning; joint state Q-learning; macroscopic simultaneous learning; optimal cooperative policy; timesharing-tracking framework; Games; Joints; Learning (artificial intelligence); Multi-agent systems; Optimization; Robots; Switches; Cooperative learning; Immediately individual reward; Multi-agent system; Timesharing Tracking;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Control Conference (CCC), 2013 32nd Chinese
Conference_Location :
Xi'an
Type :
conf
Filename :
6640678