DocumentCode :
1444606
Title :
Simulation-based optimization of Markov reward processes
Author :
Marbach, Peter ; Tsitsiklis, John N.
Author_Institution :
Center for Commun. Syst. Res., Cambridge Univ., UK
Volume :
46
Issue :
2
fYear :
2001
fDate :
2/1/2001
Firstpage :
191
Lastpage :
209
Abstract :
This paper proposes a simulation-based algorithm for optimizing the average reward in a finite-state Markov reward process that depends on a set of parameters. As a special case, the method applies to Markov decision processes where optimization takes place within a parametrized set of policies. The algorithm relies on the regenerative structure of finite-state Markov processes, involves the simulation of a single sample path, and can be implemented online. A convergence result (with probability 1) is provided.
Keywords :
Markov processes; convergence of numerical methods; decision theory; optimisation; probability; Markov decision processes; Markov reward processes; convergence; finite-state Markov processes; optimization; regenerative structure; simulation; Computational modeling; Convergence; Decision making; Dynamic programming; Laboratories; Optimization methods; State-space methods; Stochastic processes; Uncertainty
fLanguage :
English
Journal_Title :
Automatic Control, IEEE Transactions on
Publisher :
IEEE
ISSN :
0018-9286
Type :
jour
DOI :
10.1109/9.905687
Filename :
905687