DocumentCode
3277138
Title
A sampled fictitious play based learning algorithm for infinite horizon Markov Decision Processes
Author
Sisikoglu, Esra ; Epelman, Marina A. ; Smith, Robert L.
Author_Institution
Univ. of Missouri, Columbia, MO, USA
fYear
2011
fDate
11-14 Dec. 2011
Firstpage
4086
Lastpage
4097
Abstract
Using Sampled Fictitious Play (SFP) concepts, we develop SFPL (Sampled Fictitious Play Learning), a learning algorithm for solving discounted homogeneous Markov Decision Problems in which the transition probabilities are unknown and must be learned via simulation or direct observation of the system in real time. SFPL thus simultaneously updates the estimates of the unknown transition probabilities and the estimates of the optimal value and optimal action in the observed state. In the spirit of SFP, the action after each transition is selected by sampling from the empirical distribution of previous optimal-action estimates for the current state. The resulting algorithm is provably convergent. We compare its performance with that of other learning methods, including SARSA and Q-learning.
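The mechanism the abstract describes can be illustrated with a minimal sketch. This is a hypothetical implementation, not the authors' code: the function names (`sfpl`, `simulate`), the tie-breaking rule for untried actions, and the seeding of each state's action history with one copy of every action are all assumptions made for the sake of a runnable example. The sketch maintains empirical transition counts and reward averages (the learned model), a value estimate per state, and a per-state history of greedy-action estimates; each action is sampled uniformly from that history, i.e. from the empirical distribution of past optimal-action estimates.

```python
import random
from collections import defaultdict, Counter

def sfpl(simulate, states, actions, s0, gamma=0.9, steps=500, seed=0):
    """Sketch of an SFPL-style learner (illustrative, not the authors' code).

    simulate(s, a) -> (s_next, reward): black-box access to the unknown MDP.
    """
    rng = random.Random(seed)
    trans = defaultdict(lambda: defaultdict(Counter))  # trans[s][a][s'] counts
    r_sum = defaultdict(float)                         # summed reward per (s, a)
    n_sa = defaultdict(int)                            # visit count per (s, a)
    V = defaultdict(float)                             # value estimates
    # Seed each state's history with one copy of every action (an assumption),
    # so the initial sampling distribution is uniform over actions.
    history = {s: list(actions) for s in states}

    def q(s, a):
        # One-step lookahead under the empirical model; 0 for untried pairs.
        if n_sa[(s, a)] == 0:
            return 0.0
        r_bar = r_sum[(s, a)] / n_sa[(s, a)]
        total = sum(trans[s][a].values())
        return r_bar + gamma * sum(c / total * V[t] for t, c in trans[s][a].items())

    s = s0
    for _ in range(steps):
        # Sample the action from the empirical distribution of past
        # optimal-action estimates for the current state.
        a = rng.choice(history[s])
        s_next, reward = simulate(s, a)
        trans[s][a][s_next] += 1
        n_sa[(s, a)] += 1
        r_sum[(s, a)] += reward
        # Update the greedy-action and value estimates at the observed state
        # only; ties broken toward less-visited actions (an assumption).
        best = max(actions, key=lambda b: (q(s, b), -n_sa[(s, b)]))
        V[s] = q(s, best)
        history[s].append(best)
        s = s_next
    # Report the modal (most frequent) action estimate per state.
    return {st: Counter(history[st]).most_common(1)[0][0] for st in states}
```

On a toy two-state chain where action 1 in state 0 yields reward 1 and action 0 yields nothing, the sampled play concentrates on action 1 in state 0 as its empirical frequency in the history grows.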
Keywords
Markov processes; infinite horizon; learning (artificial intelligence); probability; Q-learning; SARSA; SFP concept; discounted homogeneous Markov decision problem; infinite horizon Markov decision process; sampled fictitious play learning; transition probability; Algorithm design and analysis; Approximation algorithms; Convergence; Games; Heuristic algorithms; History; Markov processes;
fLanguage
English
Publisher
ieee
Conference_Titel
Proceedings of the 2011 Winter Simulation Conference (WSC)
Conference_Location
Phoenix, AZ
ISSN
0891-7736
Print_ISBN
978-1-4577-2108-3
Electronic_ISBN
0891-7736
Type
conf
DOI
10.1109/WSC.2011.6148098
Filename
6148098
Link To Document