• DocumentCode
    3129769
  • Title
    Adaptive Optimization of Markov Reward Processes
  • Author
    Campos-Nánez, Enrique; Patek, Stephen D.
  • Author_Institution
    Department of Engineering Management and Systems Engineering, The George Washington University, 1776 G Street, Washington, DC 20052, USA; ecamposn@gwu.edu
  • fYear
    2005
  • fDate
    12-15 Dec. 2005
  • Firstpage
    8034
  • Lastpage
    8041
  • Abstract
    We consider the problem of optimizing the average reward of Markov chains controlled by two sets of parameters: 1) a set of tunable parameters and 2) a set of fixed but unknown parameters. We study the convergence characteristics of recursive estimation procedures based on the observation of regenerative cycles. We also provide sufficient conditions for the convergence to local optima of existing simulation-based optimization procedures under parameter certainty, in order to achieve simultaneous optimal selection of the tunable parameters and identification of the unknown parameters. To illustrate our approach, we discuss an algorithm that exploits the gradient of the likelihood of an observed regenerative cycle and its application to a regenerative simulation-based algorithm introduced in [1]. Our results are illustrated numerically in a problem of optimal pricing of services in a multi-class loss network.
  • Keywords
    Convergence; Dynamic programming; Modeling; Pricing; Q factor; Recursive estimation; State estimation; State-space methods; Sufficient conditions; Systems engineering and theory
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Decision and Control, 2005 and 2005 European Control Conference. CDC-ECC '05. 44th IEEE Conference on
  • Print_ISBN
    0-7803-9567-0
  • Type
    conf
  • DOI
    10.1109/CDC.2005.1583462
  • Filename
    1583462