DocumentCode :
2718043
Title :
Convergence of multiagent Q-learning: Multi-action replay process approach
Author :
Kim, Han-Eol ; Ahn, Hyo-Sung
Author_Institution :
Distrib. Control & Autonomous Syst. Lab., Gwangju Inst. of Sci. & Technol. (GIST), Gwangju, South Korea
fYear :
2010
fDate :
8-10 Sept. 2010
Firstpage :
789
Lastpage :
794
Abstract :
In this paper, we first suggest a new type of Markov model extended from Watkins' action replay process. The new Markov model, called the multi-action replay process (MARP), is designed for multiagent coordination on the basis of reward values, state transition probabilities, and an equilibrium strategy that takes joint actions among agents into account. Using this model, a multiagent Q-learning algorithm is then constructed as a cooperative reinforcement learning algorithm for completely connected agents. Finally, we prove that the multiagent Q-learning values converge to the optimal values. Simulation results are reported to illustrate the validity of the proposed multiagent Q-learning algorithm.
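The record itself does not reproduce the algorithm. As a rough sketch of the kind of joint-action Q-learning update the abstract describes, the following minimal Python example maintains a Q-table over (state, joint action) pairs; the toy dynamics, parameter values, and all names are illustrative assumptions, not details taken from the paper.

import random
from itertools import product
from collections import defaultdict

# Illustrative joint-action Q-learning sketch (assumed setup; the paper's
# MARP construction and convergence proof are not reproduced here).
N_STATES, N_AGENTS, N_ACTIONS = 4, 2, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = defaultdict(float)                     # Q[(state, joint_action)] -> value
JOINT_ACTIONS = list(product(range(N_ACTIONS), repeat=N_AGENTS))

def step(state, joint_action):
    # Placeholder stochastic transition and shared team reward.
    next_state = random.randrange(N_STATES)
    reward = 1.0 if sum(joint_action) == state % 3 else 0.0
    return next_state, reward

state = 0
for _ in range(10000):
    if random.random() < EPSILON:
        a = random.choice(JOINT_ACTIONS)                        # explore
    else:
        a = max(JOINT_ACTIONS, key=lambda ja: Q[(state, ja)])   # exploit
    next_state, reward = step(state, a)
    best_next = max(Q[(next_state, ja)] for ja in JOINT_ACTIONS)
    # Standard tabular Q-learning update applied to the joint action.
    Q[(state, a)] += ALPHA * (reward + GAMMA * best_next - Q[(state, a)])
    state = next_state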
Keywords :
Markov processes; learning (artificial intelligence); multi-agent systems; MARP; Markov model; cooperative reinforcement learning algorithm; equilibrium strategy; multiaction replay process approach; multiagent Q-learning convergence; multiagent coordination; reward values; state transition probabilities; Algorithm design and analysis; Convergence; Equations; Games; Learning; Markov processes; Mathematical model;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Intelligent Control (ISIC), 2010 IEEE International Symposium on
Conference_Location :
Yokohama
ISSN :
2158-9860
Print_ISBN :
978-1-4244-5360-3
Electronic_ISBN :
2158-9860
Type :
conf
DOI :
10.1109/ISIC.2010.5612911
Filename :
5612911