DocumentCode :
1012959
Title :
A statistical property of multiagent learning based on Markov decision process
Author :
Iwata, Kazunori ; Ikeda, Kazushi ; Sakai, Hideaki
Author_Institution :
Fac. of Inf. Sci., Hiroshima City Univ., Japan
Volume :
17
Issue :
4
fYear :
2006
fDate :
1 July 2006
Firstpage :
829
Lastpage :
842
Abstract :
We exhibit an important property called the asymptotic equipartition property (AEP) on empirical sequences in an ergodic multiagent Markov decision process (MDP). Using the AEP, which facilitates the analysis of multiagent learning, we give a statistical property of multiagent learning, such as reinforcement learning (RL), near the end of the learning process. We examine the effect of the conditions among the agents on the achievement of a cooperative policy in three different cases: blind, visible, and communicable. We also derive a bound on the speed at which the empirical sequence converges in probability to the best sequence, so that multiagent learning yields the best cooperative result.
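As background for the abstract's central notion, a minimal sketch of the asymptotic equipartition property in its standard information-theoretic form; the paper's own formulation is stated for empirical sequences of an ergodic multiagent MDP, so the notation below is illustrative rather than the article's.

% Standard AEP for a stationary ergodic source {X_t} with entropy rate H(X);
% the article applies an analogous property to empirical state-action
% sequences of a multiagent MDP, with its own notation and conditions.
\[
  -\frac{1}{n}\,\log p(X_1, X_2, \dots, X_n)
  \;\xrightarrow{\;P\;}\; H(\mathcal{X})
  \qquad \text{as } n \to \infty,
\]
so that, for large \(n\), the observed sequences concentrate on a typical set of roughly \(2^{nH(\mathcal{X})}\) sequences, each occurring with probability close to \(2^{-nH(\mathcal{X})}\).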
Keywords :
Markov processes; decision theory; learning (artificial intelligence); multi-agent systems; statistical analysis; asymptotic equipartition property; cooperative policy; empirical sequences; ergodic multiagent Markov decision process; multiagent learning; reinforcement learning; statistical property; Artificial intelligence; Concrete; Educational technology; Entropy; Informatics; Learning systems; Multiagent systems; Probability distribution; Stochastic processes; Stochastic systems; Asymptotic equipartition property (AEP); Markov decision process (MDP); multiagent system; reinforcement learning (RL); stochastic complexity (SC);
fLanguage :
English
Journal_Title :
IEEE Transactions on Neural Networks
Publisher :
IEEE
ISSN :
1045-9227
Type :
jour
DOI :
10.1109/TNN.2006.875990
Filename :
1650241