  • DocumentCode
    1580836
  • Title
    Learning to Reach Optimal Equilibrium by Influence of Other Agents Opinion
  • Author
    Barrios-Aranibar, Dennis ; Goncalves, Luiz M. G.

  • Author_Institution
    Fed. Univ. of Rio Grande do Norte, Natal
  • fYear
    2007
  • Firstpage
    198
  • Lastpage
    203
  • Abstract
    In this work, the authors extend the reinforcement learning paradigm for multi-agent systems called "influence value reinforcement learning" (IVRL). In previous work, an algorithm for repeated games was proposed, and it outperformed traditional paradigms. Here, the authors define an algorithm based on this paradigm for use when agents must learn from delayed rewards, i.e., an influence value reinforcement learning algorithm for two-agent stochastic games. The IVRL paradigm is based on the social interaction of people, especially the fact that people communicate to each other what they think about one another's actions, and these opinions influence each other's behavior. A modified version of the Q-learning algorithm using this paradigm was constructed: the so-called IVQ-learning algorithm was implemented and compared with versions of Q-learning for independent learning and joint-action learning. Our approach shows a higher probability of converging to an optimal equilibrium than the IQ-learning and JAQ-learning algorithms, especially when exploration increases.
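    The abstract describes the mechanism only at a high level: a Q-learning update augmented by opinions that agents communicate about each other's actions. As a rough, non-authoritative sketch of that idea, the Python code below adds a weighted opinion term to a standard Q-learning temporal-difference step. The class name IVQAgent, the opinion() heuristic, the beta weight, and the toy matrix game are all assumptions for illustration, not the paper's definitions.

    import random
    from collections import defaultdict

    class IVQAgent:
        """Hypothetical influence-value Q-learner (illustrative sketch only)."""

        def __init__(self, actions, alpha=0.1, gamma=0.9, beta=0.05, epsilon=0.1):
            self.q = defaultdict(float)   # Q-table keyed by (state, action)
            self.actions = list(actions)
            self.alpha = alpha            # learning rate
            self.gamma = gamma            # discount factor
            self.beta = beta              # assumed weight of the communicated opinion
            self.epsilon = epsilon        # exploration rate

        def act(self, state):
            # Epsilon-greedy action selection.
            if random.random() < self.epsilon:
                return random.choice(self.actions)
            return max(self.actions, key=lambda a: self.q[(state, a)])

        def opinion(self, state, other_action):
            # Assumed opinion signal: how the other agent's action compares,
            # under this agent's own Q-values, to the action it would prefer.
            best = max(self.q[(state, a)] for a in self.actions)
            return self.q[(state, other_action)] - best

        def update(self, state, action, reward, next_state, received_opinion):
            # Standard Q-learning TD step plus the weighted opinion term.
            best_next = max(self.q[(next_state, a)] for a in self.actions)
            td = reward + self.gamma * best_next - self.q[(state, action)]
            self.q[(state, action)] += self.alpha * td + self.beta * received_opinion

    if __name__ == "__main__":
        # Toy demo: a single-state repeated 2x2 coordination game where both
        # agents receive the same payoff; ("a", "a") is the optimal equilibrium.
        payoff = {("a", "a"): 1.0, ("a", "b"): 0.0, ("b", "a"): 0.0, ("b", "b"): 0.5}
        agent1 = IVQAgent(actions=["a", "b"])
        agent2 = IVQAgent(actions=["a", "b"])
        s = 0  # single state
        for _ in range(2000):
            a1, a2 = agent1.act(s), agent2.act(s)
            r = payoff[(a1, a2)]
            # Each agent learns from its own reward plus the other's opinion.
            agent1.update(s, a1, r, s, agent2.opinion(s, a1))
            agent2.update(s, a2, r, s, agent1.opinion(s, a2))
        print(agent1.act(s), agent2.act(s))  # typically converges to ('a', 'a')

    Under these assumptions, the opinion term nudges each agent toward actions its partner rates highly, which is one plausible reading of how such an approach could keep exploring agents near the optimal joint action.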
  • Keywords
    learning (artificial intelligence); multi-robot systems; stochastic games; IQ-learning algorithm; JAQ-learning algorithms; IVQ-learning algorithm; influence value reinforcement learning; multiagent systems; Automation; Collaborative work; Delay; Game theory; Hybrid intelligent systems; Learning; Multiagent systems; Nash equilibrium; Stochastic processes; Testing;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    7th International Conference on Hybrid Intelligent Systems (HIS 2007)
  • Conference_Location
    Kaiserslautern
  • Print_ISBN
    978-0-7695-2946-2
  • Type
    conf
  • DOI
    10.1109/HIS.2007.61
  • Filename
    4344051