Title of article :
Multi-policy improvement in stochastic optimization
with forward recursive function criteria
Author/Authors :
Hyeong Soo Chang
Issue Information :
Biweekly, consecutive issue numbering, 2005
Abstract :
Iwamoto recently established a formal transformation via an invariant imbedding to construct a
controlled Markov chain that can be solved in a backward manner, as in backward induction for
finite-horizon Markov decision processes (MDPs), for a given controlled Markov chain with a nonadditive
forward recursive objective function criterion. Chang et al. presented formal methods, called
“parallel rollout” and “policy switching,” for combining multiple given policies in MDPs, and showed
that the policy generated by each method improves upon all of the policies it combines.
This brief paper extends the parallel rollout and policy switching methods to forward recursive
objective function criteria and shows that a similar policy-improvement property holds, as in MDPs. We further discuss
how to implement these methods via simulation.
© 2004 Elsevier Inc. All rights reserved.
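
The policy-switching rule referenced in this abstract can be made concrete with a short sketch. The following is a minimal illustration in the standard additive-reward, finite-horizon MDP setting of Chang et al., not the paper's forward recursive criterion; the simulator interface, function names, and parameters are illustrative assumptions, not taken from the paper.

def estimate_value(simulate, policy, state, horizon, n_rollouts=100):
    # Monte Carlo estimate of the policy's value-to-go from `state`
    # over `horizon` remaining steps (assumed additive rewards).
    total = 0.0
    for _ in range(n_rollouts):
        s, ret = state, 0.0
        for t in range(horizon):
            a = policy(s, t)       # policies map (state, time) -> action
            s, r = simulate(s, a)  # hypothetical simulator: -> (next state, reward)
            ret += r
        total += ret
    return total / n_rollouts

def policy_switching(simulate, policies, horizon, n_rollouts=100):
    # Policy switching: at each state, act according to whichever base
    # policy has the highest estimated value-to-go.
    def switched(state, t):
        best = max(policies, key=lambda p: estimate_value(
            simulate, p, state, horizon - t, n_rollouts))
        return best(state, t)
    return switched

In this additive setting, the improvement property cited in the abstract states that the switched policy's value at every state is at least the maximum of the base policies' values; the paper establishes the analogous result for forward recursive criteria.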
Keywords :
Forward recursive objective function, Associative dynamic programs, Invariant imbedding, Parallel rollout, Policy switching
Journal title :
Journal of Mathematical Analysis and Applications