A learning scheme for stationary probabilities of large Markov chains with examples

Author

Borkar, V.S. ; Das, D.J. ; Banik, A. Datta ; Manjunath, D.

Author_Institution

Sch. of Technol. & Comput. Sci., Tata Inst. of Fundamental Res., Mumbai

fYear

2008

fDate

23-26 Sept. 2008

Firstpage

1097

Lastpage

1099

Abstract

We describe a reinforcement learning based scheme to estimate the stationary distribution of subsets of states of large Markov chains. 'Split sampling' ensures that the algorithm needs to just encode the state transitions and will not need to know any other property of the Markov chain. (An earlier scheme required knowledge of the column sums of the transition probability matrix.) This algorithm is applied to analyze the stationary distribution of the states of a node in an 802.11 network.

Keywords

Markov processes; learning (artificial intelligence); 802.11 network; Markov chains; reinforcement learning; stationary probabilities; transition probability matrix; Algorithm design and analysis; Approximation algorithms; Computer science; Eigenvalues and eigenfunctions; Function approximation; Learning; Sampling methods; State estimation; Stochastic processes; Zinc;

fLanguage

English

Publisher

IEEE

Conference_Titel

Communication, Control, and Computing, 2008 46th Annual Allerton Conference on

Conference_Location

Urbana-Champaign, IL

Print_ISBN

978-1-4244-2925-7

Electronic_ISBN

978-1-4244-2926-4

Type

conf

DOI

10.1109/ALLERTON.2008.4797682

Filename

4797682

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=2947164