Title :
Learning control of finite Markov chains with unknown transition probabilities
Author :
Sato, M. ; Abe, K. ; Takeda, H.
Author_Institution :
Tohoku University, Aramaki Aza Aoba, Sendai, Japan
fDate :
4/1/1982
Abstract :
For a Markovian decision problem in which the transition probabilities are unknown, two learning algorithms are devised from the viewpoint of asymptotic optimality. At each step, the algorithms select the decision to be used on the basis of not only the estimates of the unknown probabilities but also the uncertainty of those estimates. It is shown that the algorithms are asymptotically optimal in the sense that the probability of selecting an optimal policy converges to unity.
Keywords :
Learning control systems; Markov processes; Uncertain systems; Automatic control; Bayesian methods; Equations; Signal resolution; State-space methods; Uncertainty
Journal_Title :
Automatic Control, IEEE Transactions on
DOI :
10.1109/TAC.1982.1102893