DocumentCode :
2295159
Title :
Using suitable action selection rule in reinforcement learning
Author :
Ohta, Masayuki ; Kumada, Yoichiro ; Noda, Itsuki
Author_Institution :
Cyber Assist Res. Center, National Inst. of Adv. Industrial Sci. & Technol., Tokyo, Japan
Volume :
5
fYear :
2003
fDate :
5-8 Oct. 2003
Firstpage :
4358
Abstract :
Reinforcement learning under perceptual aliasing causes the following serious trade-off on action selection rules. When an agent decides its action by a deterministic selection rule according to action values, acquired policies may include infinite loops. On the other hand, while a stochastic selection rule can avoid such infinite loops, it reduces the efficiency of the acquired policy. To resolve this trade-off, we propose a method called "adaptive meta-selection", by which agents can determine the better selection rule, deterministic or stochastic, for each state. Results of experiments in the Pursuit Game show that the proposed method enables an agent to acquire an efficient policy while excluding infinite loops.
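The abstract contrasts deterministic (greedy) and stochastic selection over action values. A minimal sketch of the two rules, assuming greedy argmax for the deterministic case and Boltzmann (softmax) selection for the stochastic case; the paper's exact meta-selection criterion is not given in the abstract, so the per-state flag `use_stochastic` is supplied by the caller here:

```python
import math
import random

def select_action(q_values, use_stochastic, tau=1.0):
    """Pick an action index from a list of action values.

    use_stochastic=False -> deterministic (greedy) rule: argmax of values.
    use_stochastic=True  -> stochastic rule: Boltzmann (softmax) sampling
                            with temperature tau.
    """
    if not use_stochastic:
        # Deterministic rule: always take the highest-valued action.
        return max(range(len(q_values)), key=lambda a: q_values[a])
    # Stochastic rule: sample proportionally to exp(Q/tau).
    exps = [math.exp(q / tau) for q in q_values]
    r = random.random() * sum(exps)
    acc = 0.0
    for a, e in enumerate(exps):
        acc += e
        if r <= acc:
            return a
    return len(q_values) - 1  # numerical fallback
```

Under adaptive meta-selection as described, an agent would keep such a flag per state, switching a state to the stochastic rule when the deterministic one traps the policy in a loop there.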
Keywords :
learning (artificial intelligence); multi-agent systems; probability; Pursuit Game; action selection rule; adaptive meta-selection; deterministic selection rule; perceptual aliasing; reinforcement learning; stochastic selection rule; Environmental management; Information science; Learning; Noise robustness; Stochastic processes; Stochastic resonance; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2003 IEEE International Conference on Systems, Man and Cybernetics
ISSN :
1062-922X
Print_ISBN :
0-7803-7952-7
Type :
conf
DOI :
10.1109/ICSMC.2003.1245670
Filename :
1245670