• DocumentCode
    3183671
  • Title
    Multi-robot Box-pushing: Single-Agent Q-Learning vs. Team Q-Learning
  • Author
    Wang, Ying ; De Silva, Clarence W.
  • Author_Institution
    Dept. of Mech. Eng., British Columbia Cancer Res. Centre, Vancouver, BC
  • fYear
    2006
  • fDate
    9-15 Oct. 2006
  • Firstpage
    3694
  • Lastpage
    3699
  • Abstract
    In this paper, two types of multi-agent reinforcement learning algorithms are employed in a task of multi-robot box-pushing. The first one is a direct extension of single-agent Q-learning, which does not have a solid theoretical foundation because it violates the static environment assumption of the Q-learning algorithm. The second one is the Team Q-learning algorithm, which is a multi-agent reinforcement learning algorithm and is proved to converge to the optimal policy. The states, actions, and reward function of the algorithms are presented in the paper. Based on the two Q-learning algorithms, a fully distributed multi-robot system is developed. Computer simulations are carried out using the developed system. The simulation results show that the two algorithms are effective in a simple environment. It is shown, however, that the single-agent Q-learning algorithm does a better job than the Team Q-learning algorithm in a complicated and unknown environment with many obstacles.
  • Keywords
    control engineering computing; learning (artificial intelligence); multi-agent systems; multi-robot systems; distributed multi-robot system; multi-agent reinforcement learning algorithms; multi-robot box-pushing; single-agent Q-learning; team Q-learning; Computational modeling; Computer simulation; Intelligent robots; Machine learning algorithms; Mechanical engineering; Multirobot systems; Orbital robotics; Robot kinematics; Solids; Transportation; Box-pushing; Multi-robot Systems; Multiagent Reinforcement Learning; Team Q-learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    1-4244-0258-1
  • Electronic_ISBN
    1-4244-0259-X
  • Type
    conf
  • DOI
    10.1109/IROS.2006.281729
  • Filename
    4058979