DocumentCode :
1152877
Title :
Recursive Algorithms for Adaptive Control of Finite Markov Chains
Author :
El-Fattah, Yousri M.
Volume :
11
Issue :
2
fYear :
1981
Firstpage :
135
Lastpage :
144
Abstract :
The problem of controlling a finite Markov chain so as to maximize the long-run expected reward per unit time is studied. The chain's transition probabilities depend upon an unknown parameter taking values in a subset [a,b] of R^n. A control policy is defined as the probability of selecting a control action for each state of the chain. Derived is a Taylor-like expansion formula for the expected reward in terms of policy variations. Based on that result a recursive stochastic gradient algorithm is presented for the adaptation of the control policy at consecutive times. The gradient depends on the estimated transition parameter which is also recursively updated using the gradient of the likelihood function. Convergence with probability 1 is proved for the control and estimation algorithms.
Keywords :
Adaptive control; Automatic control; Convergence; Dynamic programming; Learning automata; Linear programming; Optimal control; Parameter estimation; Recursive estimation; Stochastic processes;
fLanguage :
English
Journal_Title :
Systems, Man and Cybernetics, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9472
Type :
jour
DOI :
10.1109/TSMC.1981.4308638
Filename :
4308638