Title :
Adaptive Optimization of Markov Reward Processes
Author :
Campos-Nánez, Enrique ; Patek, Stephen D.
Author_Institution :
Department of Engineering Management and Systems Engineering, The George Washington University, 1776 G Street, Washington, DC 20052, USA; ecamposn@gwu.edu
Abstract :
We consider the problem of optimizing the average reward of Markov chains controlled by two sets of parameters: 1) a set of tunable parameters and 2) a set of fixed but unknown parameters. We study the convergence characteristics of recursive estimation procedures based on the observation of regenerative cycles. We also provide sufficient conditions for the convergence to local optima of existing simulation-based optimization procedures under parameter certainty, in order to achieve simultaneous optimal selection of the tunable parameters and identification of the unknown parameters. To illustrate our approach, we discuss an algorithm that exploits the gradient of the likelihood of an observed regenerative cycle, and its application to a regenerative simulation-based algorithm introduced in [1]. Our results are illustrated numerically on a problem of optimal pricing of services in a multi-class loss network.
Keywords :
Convergence; Dynamic programming; Modeling; Pricing; Q factor; Recursive estimation; State estimation; State-space methods; Sufficient conditions; Systems engineering and theory
Conference_Title :
Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC '05. 44th IEEE Conference on
Print_ISBN :
0-7803-9567-0
DOI :
10.1109/CDC.2005.1583462