DocumentCode :
2382124
Title :
Reinforcement learning with nonstationary reward depending on the episode
Author :
Shibuya, Takeshi ; Yasunobu, Seiji
Author_Institution :
Grad. Sch. of Syst. & Inf. Eng., Univ. of Tsukuba, Tsukuba, Japan
fYear :
2011
fDate :
9-12 Oct. 2011
Firstpage :
2145
Lastpage :
2150
Abstract :
A model that represents nonstationary rewards is proposed for reinforcement learning (RL). RL is a framework in which an agent learns through interaction with an environment: the agent receives a reward and adapts its behavior accordingly. The reward is specified by the designer; because the designer does not need to specify the behavior itself, RL is expected to be applicable to a wide range of applications. However, conventional RL algorithms work under the assumption that the environment is stationary; in other words, they cannot handle nonstationary rewards or a change of the objective. From the point of view of real-world applications, it is necessary for the agent to cope with such a change. In this paper, a learning technique that deals with a temporal change of the reward is proposed. In the proposed reward representation, the reward is divided into two parts: an episode-dependent part and an episode-independent part. Simulation experiments show the effectiveness of the proposed method.
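The abstract does not give the paper's update equations, so the following is only a minimal sketch of the stated idea, assuming a tabular Q-learning agent on a toy chain environment; the environment, the reward functions, and all parameter names are illustrative assumptions, not taken from the paper. It shows the reward handed to the agent decomposed into an episode-independent part plus an episode-dependent part that varies with the episode index.

import random
from collections import defaultdict

# Illustrative sketch only (not the authors' implementation): a small chain MDP
# where the total reward is base_reward (episode-independent) plus
# episode_reward (episode-dependent), as described in the abstract.
N_STATES, N_ACTIONS = 5, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.2

def base_reward(state, action):
    # Episode-independent part: fixed across all episodes.
    return 1.0 if (state == N_STATES - 1 and action == 1) else 0.0

def episode_reward(state, action, episode):
    # Episode-dependent part: changes with the episode index, modelling a
    # temporal change of the objective (hypothetical form).
    return 0.5 if (episode >= 100 and state == 0 and action == 0) else 0.0

def step(state, action):
    # Toy deterministic transition: action 1 moves right, action 0 moves left.
    next_state = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    return next_state, next_state == N_STATES - 1

Q = defaultdict(float)
for episode in range(200):
    state = 0
    for t in range(500):  # cap episode length so the sketch always terminates
        if random.random() < EPSILON:
            action = random.randrange(N_ACTIONS)
        else:
            action = max(range(N_ACTIONS), key=lambda a: Q[(state, a)])
        next_state, done = step(state, action)
        # Total reward = episode-independent part + episode-dependent part.
        r = base_reward(state, action) + episode_reward(state, action, episode)
        best_next = max(Q[(next_state, a)] for a in range(N_ACTIONS))
        Q[(state, action)] += ALPHA * (r + GAMMA * best_next - Q[(state, action)])
        state = next_state
        if done:
            break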
Keywords :
learning (artificial intelligence); agent behavior; episode-independent part; nonstationary reward; reinforcement learning algorithm; reward representation; Convergence; Equations; Learning; Learning systems; Markov processes; Vectors; learning in environment with temporal change of reward; nonstationary reward; reinforcement learning;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Systems, Man, and Cybernetics (SMC), 2011 IEEE International Conference on
Conference_Location :
Anchorage, AK
ISSN :
1062-922X
Print_ISBN :
978-1-4577-0652-3
Type :
conf
DOI :
10.1109/ICSMC.2011.6083989
Filename :
6083989