DocumentCode
3129769
Title
Adaptive Optimization of Markov Reward Processes
Author
Campos-Nánez, Enrique ; Patek, Stephen D.
Author_Institution
Department of Engineering Management and Systems Engineering, The George Washington University, 1776 G Street, Washington, DC 20052, USA; ecamposn@gwu.edu
fYear
2005
fDate
12-15 Dec. 2005
Firstpage
8034
Lastpage
8041
Abstract
We consider the problem of optimizing the average reward of Markov chains controlled by two sets of parameters: 1) a set of tunable parameters and 2) a set of fixed but unknown parameters. We study the convergence characteristics of recursive estimation procedures based on the observation of regenerative cycles. We also provide sufficient conditions for the convergence to local optima of existing simulation-based optimization procedures under parameter certainty, in order to achieve simultaneous optimal selection of the tunable parameters and identification of the unknown parameters. To illustrate our approach, we discuss an algorithm that exploits the gradient of the likelihood of an observed regenerative cycle, and its application to a regenerative simulation-based algorithm introduced in [1]. Our results are illustrated numerically in a problem of optimal pricing of services in a multi-class loss network.
Keywords
Convergence; Dynamic programming; Modeling; Pricing; Q factor; Recursive estimation; State estimation; State-space methods; Sufficient conditions; Systems engineering and theory;
fLanguage
English
Publisher
IEEE
Conference_Titel
Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC '05. 44th IEEE Conference on
Print_ISBN
0-7803-9567-0
Type
conf
DOI
10.1109/CDC.2005.1583462
Filename
1583462