• DocumentCode
    1233669
  • Title
    Solving Controlled Markov Set-Chains With Discounting via Multipolicy Improvement
  • Author
    Chang, Hyeong Soo ; Chong, Edwin K P
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Sogang Univ., Seoul
  • Volume
    52
  • Issue
    3
  • fYear
    2007
  • fDate
    3/1/2007
  • Firstpage
    564
  • Lastpage
    569
  • Abstract
    We consider Markov decision processes (MDPs) where the state transition probability distributions are not uniquely known, but are known to belong to some intervals (so-called "controlled Markov set-chains"), with infinite-horizon discounted reward criteria. We present formal methods to improve multiple policies for solving such controlled Markov set-chains. Our multipolicy improvement methods follow the spirit of parallel rollout and policy switching for solving MDPs. In particular, these methods are useful for online control of Markov set-chains and for designing policy iteration (PI) type algorithms. We develop a PI-type algorithm and prove that it converges to an optimal policy.
  • Keywords
    Markov processes; infinite horizon; stochastic systems; Markov decision process; controlled Markov set chains; infinite horizon discounted reward criteria; multipolicy improvement; parallel rollout; policy iteration algorithm; state transition probability distribution; Algorithm design and analysis; Computer science; Probability distribution; Robust control; Sensitivity analysis; Throughput; Traffic control; Controlled Markov process; Markov decision process (MDP); Markov set-chain; policy iteration; rollout
  • fLanguage
    English
  • Journal_Title
    Automatic Control, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9286
  • Type
    jour
  • DOI
    10.1109/TAC.2007.892381
  • Filename
    4132899