An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming

Author

Chang, Hyeong Soo ; Fu, Michael C. ; Hu, Jiaqiao ; Marcus, Steven I.

Author_Institution

Dept. of Comput. Sci. & Eng., Sogang Univ., Seoul

Volume

52

Issue

1

fYear

2007

Firstpage

89

Lastpage

94

Abstract

We present a simulation-based algorithm called "Simulated Annealing Multiplicative Weights" (SAMW) for solving large finite-horizon stochastic dynamic programming problems. At each iteration of the algorithm, a probability distribution over candidate policies is updated by a simple multiplicative weight rule, and with proper annealing of a control parameter, the generated sequence of distributions converges to a distribution concentrated only on the best policies. The algorithm is "asymptotically efficient," in the sense that for the goal of estimating the value of an optimal policy, a provably convergent finite-time upper bound for the sample mean is obtained.
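
The multiplicative weight rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the annealing schedule or the sampling mechanism, so the schedule beta_i = 1 + 1/sqrt(i), the noisy value samples clipped to [0, 1], the toy policy values, and the names `multiplicative_weight_update` and `run_sketch` are all assumptions made for illustration.

```python
import math
import random


def multiplicative_weight_update(weights, sampled_values, beta):
    """One step: w(pi) <- w(pi) * beta ** V(pi), then renormalize to sum to 1."""
    scaled = [w * beta ** v for w, v in zip(weights, sampled_values)]
    total = sum(scaled)
    return [s / total for s in scaled]


def run_sketch(true_means, iterations=2000, seed=0):
    """Toy loop over a fixed set of candidate policies (hypothetical setup).

    true_means[k] stands in for the expected value of policy k; each
    iteration draws one noisy simulated value per policy.
    """
    rng = random.Random(seed)
    n = len(true_means)
    weights = [1.0 / n] * n  # start from the uniform distribution
    for i in range(1, iterations + 1):
        # Assumed annealing schedule: beta_i > 1 and beta_i -> 1 as i grows.
        beta = 1.0 + 1.0 / math.sqrt(i)
        # Assumed simulation model: mean plus uniform noise, clipped to [0, 1].
        samples = [
            min(1.0, max(0.0, m + rng.uniform(-0.2, 0.2))) for m in true_means
        ]
        weights = multiplicative_weight_update(weights, samples, beta)
    return weights


if __name__ == "__main__":
    final = run_sketch([0.3, 0.5, 0.8])
    print([round(w, 4) for w in final])
```

In this toy setting the distribution should concentrate on the policy with the largest mean value, which mirrors the qualitative behavior the abstract claims; it says nothing about the convergence rate or the finite-time bound established in the paper.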

Keywords

dynamic programming; probability; simulated annealing; stochastic programming; Markov decision process; finite horizon stochastic dynamic programming; probability distribution; simulated annealing multiplicative weight; simulation-based algorithm; Computer science; Mathematics; Random number generation; Statistics; Stochastic processes; Uncertainty; Upper bound; Learning algorithms; simulation; stochastic dynamic programming

fLanguage

English

Journal_Title

IEEE Transactions on Automatic Control

Publisher

IEEE

ISSN

0018-9286

Type

jour

DOI

10.1109/TAC.2006.887917

Filename

4060977