DocumentCode
529353
Title
Automatic tuning of judgement parameter in continuous state exploitation-oriented learning
Author
Miyazaki, Kazuteru
Author_Institution
Dept. of Assessment & Res. for Degree Awarding, Univ. Evaluation, Tokyo, Japan
fYear
2010
fDate
18-21 Aug. 2010
Firstpage
3246
Lastpage
3249
Abstract
The rational policy making algorithm (RPM) and the penalty avoiding rational policy making algorithm (PARP) under continuous state spaces have an important parameter that decides the sameness of basic functions. An appropriate value must be set through a preliminary experiment. In this paper, we propose an automatic tuning mechanism for the judgement parameter. We show the effectiveness of our proposal using a pole-cart problem.
Keywords
decision making; learning (artificial intelligence); automatic tuning mechanism; continuous state exploitation oriented learning; judgement parameter; penalty avoiding rational policy making algorithm; pole-cart problem; Algorithm design and analysis; Learning; Machine learning; Markov processes; Proposals; Tuning; Continuous State Spaces; Exploitation-oriented Learning XoL; PARP; RPM; Reinforcement Learning;
fLanguage
English
Publisher
ieee
Conference_Titel
SICE Annual Conference 2010, Proceedings of
Conference_Location
Taipei
Print_ISBN
978-1-4244-7642-8
Type
conf
Filename
5602589