Title of article
Approximate receding horizon approach for Markov decision processes: average reward case
Author/Authors
Hyeong Soo Chang
Issue Information
Biweekly journal, serial year 2003
Pages
16
From page
636
To page
651
Abstract
We consider an approximation scheme for solving Markov decision processes (MDPs) with countable state space, finite action space, and bounded rewards, which uses an approximate solution of a fixed finite-horizon sub-MDP of a given infinite-horizon MDP to create a stationary policy; we call this scheme "approximate receding horizon control." We first analyze the performance of approximate receding horizon control for the infinite-horizon average reward under an ergodicity assumption, which also generalizes the result obtained by White (J. Oper. Res. Soc. 33 (1982) 253–259). We then study two examples of approximate receding horizon control based on lower bounds on the exact solution of the sub-MDP. The first control policy is based on a finite-horizon approximation of Howard's policy improvement of a single policy, and the second is based on a generalization of the single-policy improvement to multiple policies. Along the way, we also provide a simple alternative proof of policy improvement for countable state spaces. We finally discuss practical implementations of these schemes via simulation.
© 2003 Elsevier Inc. All rights reserved.
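For intuition only, here is a minimal sketch (an assumption of this record, not code from the paper) of the receding horizon idea for a finite-state, finite-action MDP: the H-horizon sub-MDP is solved by backward induction, and the stationary policy acts greedily with respect to the resulting (H-1)-horizon value. The paper's setting (countable state space, lower bounds obtained e.g. via rollout of base policies) would replace the exact backward induction with such approximations; the names P, R, H and the NumPy representation below are illustrative.

import numpy as np

def finite_horizon_values(P, R, H):
    # Backward induction for the H-horizon sub-MDP with zero terminal value.
    # P[a] is the |S| x |S| transition matrix for action a; R is |S| x |A| rewards.
    n_states, n_actions = R.shape
    V = np.zeros(n_states)
    for _ in range(H):
        Q = np.stack([R[:, a] + P[a] @ V for a in range(n_actions)], axis=1)
        V = Q.max(axis=1)
    return V

def receding_horizon_policy(P, R, H):
    # Stationary policy: one-step greedy lookahead w.r.t. the (H-1)-horizon value,
    # i.e., in each state take the first action of an H-horizon optimal policy.
    n_states, n_actions = R.shape
    V = finite_horizon_values(P, R, H - 1)
    Q = np.stack([R[:, a] + P[a] @ V for a in range(n_actions)], axis=1)
    return Q.argmax(axis=1)

In the approximate variants studied in the paper, the exact sub-MDP value above would be replaced by a lower bound, for instance the finite-horizon value of a fixed base policy (single-policy improvement) or the maximum over several base policies (the multi-policy generalization), estimated by simulation.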
Keywords
Receding horizon control, Markov decision process, Infinite-horizon average reward, Ergodicity, Policy improvement, Rollout
Journal title
Journal of Mathematical Analysis and Applications
Serial Year
2003
Record number
930854