• DocumentCode
    250543
  • Title
    Sample path sharing in simulation-based policy improvement
  • Author
    Di Wu ; Qing-Shan Jia ; Chun-Hung Chen
  • Author_Institution
    Dept. of Autom., Tsinghua Univ., Beijing, China
  • fYear
    2014
  • fDate
    May 31 2014-June 7 2014
  • Firstpage
    3291
  • Lastpage
    3296
  • Abstract
    Simulation-based policy improvement (SBPI) is widely used to improve a given base policy through simulation. The basic idea of SBPI is to estimate all the Q-factors for a given state by simulation and then select the action with the minimal estimated cost. It is therefore important to use the given simulation budget efficiently so that the best action is selected with high probability. Unlike existing budget allocation algorithms, which estimate the Q-factors by independent simulation, we share sample paths across actions to improve the probability of correctly selecting the best action. Our method can be combined with equal allocation, Successive Rejects, and optimal computing budget allocation to increase their probabilities of correct selection and to obtain better policies in SBPI. The size of the improvement depends on the overlap in the states reachable under different actions. Numerical results show that, when such overlap exists, combining our method with equal allocation, Successive Rejects, and optimal computing budget allocation yields a higher probability of correct selection as well as better policies in SBPI.
  • Keywords
    budgeting; discrete event simulation; Q-factors estimation; SBPI; budget allocation algorithm; discrete event dynamic system; optimal computing budget allocation; sample path sharing; simulation-based policy improvement; Aggregates; Computational modeling; Estimation; Optimization; Q-factor; Resource management; simulation-based optimization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Robotics and Automation (ICRA), 2014 IEEE International Conference on
  • Conference_Location
    Hong Kong
  • Type
    conf
  • DOI
    10.1109/ICRA.2014.6907332
  • Filename
    6907332
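
The basic SBPI step described in the abstract (estimate every Q-factor at a state by simulation, then pick the action with the minimal estimated cost) can be sketched as follows. This is a minimal illustration under equal allocation with independent sample paths only; it does not reproduce the paper's sample path sharing scheme. The toy chain model, the cost function, the base policy, and every name in the code (`step`, `base_policy`, `rollout_cost`, `improve_at_state`, `HORIZON`, `GAMMA`) are assumptions made for the example and are not taken from the paper.

```python
import random

# Hypothetical toy problem (not from the paper): a 5-state chain, 3 actions.
N_STATES = 5
ACTIONS = (0, 1, 2)   # 0 = move left, 1 = stay, 2 = move right
HORIZON = 20          # truncated rollout length (assumption)
GAMMA = 0.9           # discount factor (assumption)


def step(state, action, rng):
    """Simulate one transition; return (stage_cost, next_state)."""
    move = action - 1
    if rng.random() < 0.2:
        move = 0  # with probability 0.2 the move fails
    next_state = min(max(state + move, 0), N_STATES - 1)
    # Noisy cost: higher states cost more, moving costs a little extra.
    cost = float(next_state) + 0.5 * abs(action - 1) + rng.gauss(0.0, 1.0)
    return cost, next_state


def base_policy(state):
    """Hypothetical base policy to be improved: always stay."""
    return 1


def rollout_cost(state, action, rng):
    """One sample path: apply `action` first, then follow the base policy."""
    total, discount = 0.0, 1.0
    for _ in range(HORIZON):
        cost, state = step(state, action, rng)
        total += discount * cost
        discount *= GAMMA
        action = base_policy(state)
    return total


def improve_at_state(state, budget, rng):
    """Equal allocation: split the budget evenly over the actions,
    average the sampled costs into Q-factor estimates, and return
    the minimal-cost action together with the estimates."""
    per_action = budget // len(ACTIONS)
    q = {}
    for a in ACTIONS:
        samples = [rollout_cost(state, a, rng) for _ in range(per_action)]
        q[a] = sum(samples) / per_action
    best = min(q, key=q.get)
    return best, q


if __name__ == "__main__":
    rng = random.Random(0)
    best, q = improve_at_state(state=2, budget=3000, rng=rng)
    print("improved action at state 2:", best)
    print("estimated Q-factors:", q)
```

In this sketch every action receives its own independent sample paths. The abstract's observation is that when different actions can reach the same states, sharing sample paths across actions can raise the probability of correct selection for the same budget.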