• DocumentCode
    529353
  • Title

    Automatic tuning of judgement parameter in continuous state exploitation-oriented learning

  • Author

    Miyazaki, Kazuteru

  • Author_Institution
    Dept. of Assessment & Res. for Degree Awarding, Univ. Evaluation, Tokyo, Japan
  • fYear
    2010
  • fDate
    18-21 Aug. 2010
  • Firstpage
    3246
  • Lastpage
    3249
  • Abstract
    The rational policy making algorithm (RPM) and the penalty avoiding rational policy making algorithm (PARP) under continuous state spaces have an important parameter, the judgement parameter, that decides the sameness of basic functions. It is necessary to set an appropriate value through a preliminary experiment. In this paper, we propose an automatic tuning mechanism for the judgement parameter. We show the effectiveness of our proposal using a pole-cart problem.
  • Keywords
    decision making; learning (artificial intelligence); automatic tuning mechanism; continuous state exploitation oriented learning; judgement parameter; penalty avoiding rational policy making algorithm; pole-cart problem; Algorithm design and analysis; Learning; Machine learning; Markov processes; Proposals; Tuning; Continuous State Spaces; Exploitation-oriented Learning XoL; PARP; RPM; Reinforcement Learning;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    SICE Annual Conference 2010, Proceedings of
  • Conference_Location
    Taipei
  • Print_ISBN
    978-1-4244-7642-8
  • Type

    conf

  • Filename
    5602589