Regional Information Center for Science and Technology - Potential-based online policy iteration algorithms for Markov decision processes

Title of article :

Potential-based online policy iteration algorithms for Markov decision processes

Author/Authors :

FANG, Haitao (author); Cao, Xi-Ren (author)

Issue Information :

Periodical with serial number, year 2004

Pages :

From page :

493

To page :

505

Abstract :

Performance potentials play a crucial role in performance sensitivity analysis and policy iteration of Markov decision processes. The potentials can be estimated on a single sample path of a Markov process. In this paper, we propose two potential-based online policy iteration algorithms for performance optimization of Markov systems. The algorithms are based on online estimation of potentials and stochastic approximation. We prove that with these two algorithms the optimal policy can be attained after a finite number of iterations. A simulation example is given to illustrate the main ideas and the convergence rates of the algorithms.

Keywords :

Hydrograph

Journal title :

IEEE Transactions on Automatic Control

Serial Year :

2004

Record number :

97734

Link To Document :

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=97734