• DocumentCode
    2002811
  • Title
    Reinforcement learning with particles for instant optimality
  • Author
    Beppu, T. ; Notsu, A. ; Honda, Kazuhiro ; Ichihashi, Hayato
  • Author_Institution
    Osaka Prefecture Univ., Sakai, Japan
  • fYear
    2012
  • fDate
    20-24 Nov. 2012
  • Firstpage
    1528
  • Lastpage
    1533
  • Abstract
    In this paper, we propose a new Actor-Critic method for the agent's environment and action space, based on the standard Actor-Critic method and particle swarm optimization (PSO). In the algorithm, particles represent the cluster centers of states or actions and explore the space in order to find an appropriate partition of it. The purposes of this study are to improve learning efficiency and to segment the space heuristically. In our method, particles move through the space during the agent's learning process. Appropriate segmentation can minimize the learning time and enables us to recognize the evolutionary process. Thus, this method is also designed for humanlike decisions in the learning process. The simulation results indicate that our method forms clusters in the action and state space. Space segmentation, such as group formation, language systems and culture, will be revealed by multi-agent social simulation with our method.
  • Keywords
    learning (artificial intelligence); multi-agent systems; particle swarm optimisation; PSO; action space; actor-critic method; agent environment; agent learning process; evolutionary process; heuristic space segmentation; learning efficiency improvement; learning time; multi-agent social simulation; particle swarm optimization; reinforcement learning; space segmentation; Actor-Critic; PSO; Particles; Reinforcement Learning; Segmentalized space;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing and Intelligent Systems (SCIS) and 13th International Symposium on Advanced Intelligent Systems (ISIS), 2012 Joint 6th International Conference on
  • Conference_Location
    Kobe
  • Print_ISBN
    978-1-4673-2742-8
  • Type
    conf
  • DOI
    10.1109/SCIS-ISIS.2012.6505097
  • Filename
    6505097