Title :
Using suitable action selection rule in reinforcement learning
Author :
Ohta, Masayuki ; Kumada, Yoichiro ; Noda, Itsuki
Author_Institution :
Cyber Assist Res. Center, National Inst. of Adv. Industrial Sci. & Technol., Tokyo, Japan
Abstract :
Reinforcement learning under perceptual aliasing involves a serious trade-off among action selection rules. When an agent decides its actions by a deterministic selection rule according to action values, the acquired policy may include infinite loops. On the other hand, while a stochastic selection rule can avoid such infinite loops, it reduces the efficiency of the acquired policy. To resolve this trade-off, we propose a method called "adaptive meta-selection", by which an agent can determine, for each state, which of the deterministic and stochastic rules is better. Results of experiments on the Pursuit Game show that the proposed method enables an agent to acquire a policy that performs efficiently while excluding infinite loops.
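Illustration (not part of the original record) :
The abstract does not specify how the meta-selection between rules is made. The Python sketch below shows one plausible reading, assuming each state keeps a scalar preference that is nudged toward whichever rule (greedy or softmax) recently yielded higher return. All names here (AdaptiveMetaSelector, pref, meta_lr, update_meta) are hypothetical and are not the authors' implementation.

    # Hypothetical sketch of per-state meta-selection between a deterministic
    # (greedy) and a stochastic (softmax) action-selection rule.
    import math
    import random
    from collections import defaultdict

    class AdaptiveMetaSelector:
        def __init__(self, n_actions, temperature=1.0, meta_lr=0.1):
            self.n_actions = n_actions
            self.temperature = temperature
            self.meta_lr = meta_lr
            self.q = defaultdict(lambda: [0.0] * n_actions)   # action values per state
            self.pref = defaultdict(float)                     # >= 0 favours the greedy rule

        def _greedy(self, state):
            values = self.q[state]
            return max(range(self.n_actions), key=lambda a: values[a])

        def _softmax(self, state):
            values = self.q[state]
            exps = [math.exp(v / self.temperature) for v in values]
            threshold, acc = random.random() * sum(exps), 0.0
            for action, e in enumerate(exps):
                acc += e
                if threshold <= acc:
                    return action
            return self.n_actions - 1

        def select(self, state):
            """Pick a rule for this state, then an action under that rule."""
            use_greedy = self.pref[state] >= 0.0
            action = self._greedy(state) if use_greedy else self._softmax(state)
            return action, use_greedy

        def update_meta(self, state, used_greedy, observed_return, baseline):
            """Shift the per-state preference toward the rule that beat the baseline."""
            delta = observed_return - baseline
            self.pref[state] += self.meta_lr * (delta if used_greedy else -delta)

    # Usage (assumed interface): observations only need to be hashable.
    selector = AdaptiveMetaSelector(n_actions=4)
    state = ("predator_view", 1)
    action, used_greedy = selector.select(state)
    selector.update_meta(state, used_greedy, observed_return=1.0, baseline=0.5)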
Keywords :
learning (artificial intelligence); multi-agent systems; probability; Pursuit Game; action selection rule; adaptive meta-selection; deterministic selection rule; perceptual aliasing; reinforcement learning; stochastic selection rule; Environmental management; Information science; Learning; Noise robustness; Stochastic processes; Stochastic resonance; Working environment noise;
Conference_Title :
2003 IEEE International Conference on Systems, Man and Cybernetics
Print_ISBN :
0-7803-7952-7
DOI :
10.1109/ICSMC.2003.1245670