Title :
Single sample path based optimization of Markov systems: examples and algorithms
Author_Institution :
Hong Kong Univ. of Sci. & Technol., Kowloon, Hong Kong
Abstract :
Motivated by the need for online optimization of real-world engineering systems, we study single-sample-path-based algorithms for Markov decision problems (MDPs). We give a simple example to explain the advantages of the sample-path-based approach over the traditional computation-based approach: matrix inversion is not required; some transition probabilities do not have to be known; it may save storage space; and it offers the flexibility of updating the actions for only a subset of the state space in each iteration. The effect of estimation errors and the convergence properties of the sample-path-based approach are studied. Finally, we propose a “fast” algorithm which updates the policy whenever the system reaches a particular set of states; the algorithm converges to the true optimal policy with probability one under some conditions.
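The first two advantages listed in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it only estimates the long-run average reward of a fixed policy from one observed sample path. The two-state chain `P`, the reward vector `f`, and the function names are hypothetical and chosen for illustration. `P` is used only to generate the path, standing in for the real system; the estimator sees nothing but the visited states and their rewards.

```python
import random

def simulate_path(P, start, n_steps, rng):
    """Generate one sample path of a finite Markov chain.

    P stands in for the real system here; the estimator below never reads it.
    """
    states = range(len(P))
    path = [start]
    state = start
    for _ in range(n_steps):
        state = rng.choices(states, weights=P[state])[0]
        path.append(state)
    return path

def estimate_average_reward(path, f):
    """Time average of the rewards along the path (no matrix inversion)."""
    return sum(f[s] for s in path) / len(path)

# Hypothetical two-state example (assumed values, not from the paper).
P = [[0.7, 0.3],
     [0.4, 0.6]]
f = [1.0, 3.0]

rng = random.Random(0)
path = simulate_path(P, start=0, n_steps=200_000, rng=rng)
eta_hat = estimate_average_reward(path, f)

# Reference value from the stationary distribution pi = (4/7, 3/7),
# which the computation-based approach would obtain by solving pi P = pi.
eta_exact = (4 / 7) * f[0] + (3 / 7) * f[1]

print(f"sample-path estimate: {eta_hat:.4f}")
print(f"exact value:          {eta_exact:.4f}")
```

For an ergodic chain the time average converges to the exact value as the path grows, so the two printed numbers should agree closely. Policy improvement would additionally need estimates of quantities such as potentials, which this sketch does not attempt.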
Keywords :
Markov processes; convergence; decision theory; optimisation; Markov decision problems; Markov systems; online optimization; real world engineering systems; single sample path based optimization; true optimal policy; Communication networks; Estimation error; Manufacturing systems; Optimization; Performance analysis; State estimation; State-space methods; Systems engineering and theory
Conference_Title :
Proceedings of the 36th IEEE Conference on Decision and Control, 1997
Conference_Location :
San Diego, CA
Print_ISBN :
0-7803-4187-2
DOI :
10.1109/CDC.1997.650711