Title :
Dynamically identifying regenerative cycles in Simulation-based Optimization Algorithms for Markov chains
Author :
Campos-Nanez, Enrique ; Patek, S.D.
Author_Institution :
Dept. of Eng. Manage. & Syst. Eng., George Washington Univ., DC, USA
fDate :
6/1/2004 12:00:00 AM
Abstract :
Simulation-based algorithms for maximizing the average reward of a parameterized Markov chain often rely upon the existence of a state which is recurrent for all choices of parameter values. The question of which recurrent state should serve to mark the end of a regenerative cycle is a very important practical consideration in applications. Even when all of the states of the process are recurrent, some states tend to be visited more often than others, and lengthy renewal cycles tend to result in high variance estimates of the gradient. To address this difficulty, we analyze a mechanism for adjusting this special state dynamically (i*-adaptation) as applied to the "batch" simulation-based optimization algorithm of a previous paper. We show that the desirable convergence properties of the original "batch" algorithm are retained with i*-adaptation, namely the almost sure convergence of the parameter vector to a critical point.
Keywords :
Markov processes; convergence; gradient methods; optimisation; Markov chains; convergence; gradient method; parameter vector; recurrent state; regenerative cycles identification; simulation based optimization; Algorithm design and analysis; Analytical models; Computational modeling; Convergence; Costs; Engineering profession; Recursive estimation; State estimation; State-space methods; Systems engineering and theory; Markov chains; regenerative cycles; simulation-based optimization;
Journal_Title :
Automatic Control, IEEE Transactions on
DOI :
10.1109/TAC.2004.829637