DocumentCode :
838139
Title :
Learning control of finite Markov chains with unknown transition probabilities
Author :
Sato, M. ; Abe, K. ; Takeda, H.
Author_Institution :
Tohoku University, Aramaki Aza Aoba, Sendai, Japan
Volume :
27
Issue :
2
fYear :
1982
fDate :
4/1/1982
Firstpage :
502
Lastpage :
505
Abstract :
For a Markovian decision problem in which the transition probabilities are unknown, two learning algorithms are devised from the viewpoint of asymptotic optimality. At each step, the algorithms select the decision to be used on the basis of not only the estimates of the unknown probabilities but also the uncertainty of those estimates. It is shown that the algorithms are asymptotically optimal in the sense that the probability of selecting an optimal policy converges to unity.
Keywords :
Learning control systems; Markov processes; Uncertain systems; Automatic control; Bayesian methods; Equations; Signal resolution; State-space methods; Uncertainty;
fLanguage :
English
Journal_Title :
Automatic Control, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9286
Type :
jour
DOI :
10.1109/TAC.1982.1102893
Filename :
1102893