Title :
Proposal and evaluation of the penalty avoiding rational policy making algorithm with penalty level
Author :
Miyazaki, Kazuteru ; Kojima, Tomomizu ; Kobayashi, Hiroaki
Author_Institution :
Nat. Instn. for Acad. Degrees & Univ. Evaluation, Tokyo
Abstract :
Reinforcement learning (RL) is a kind of machine learning that aims to adapt an agent to a given environment by means of rewards and penalties. The penalty avoiding rational policy making algorithm (PARP) and penalty avoiding profit sharing (PAPS) are known examples of RL systems that can suppress penalties and learn a rational policy. However, they cannot handle multiple kinds of penalties. In this paper, we extend PARP/PAPS to environments in which several kinds of penalties exist. We propose the penalty avoiding rational policy making algorithm with penalty level (PARPL), which can control how penalties are avoided. We show the effectiveness of PARPL through soccer game simulations.
Keywords :
learning (artificial intelligence); machine learning; penalty avoiding profit sharing; penalty avoiding rational policy making; penalty level; reinforcement learning; Dynamic programming; Electronic mail; Machine learning algorithms; Proposals; Penalty Avoiding Rational Policy Making algorithm; Profit Sharing; Reward and Penalty; soccer game
Conference_Title :
SICE, 2007 Annual Conference
Conference_Location :
Takamatsu
Print_ISBN :
978-4-907764-27-2
Electronic_ISBN :
978-4-907764-27-2
DOI :
10.1109/SICE.2007.4421459