• DocumentCode
    1128298
  • Title

    Sequential Q-Learning With Kalman Filtering for Multirobot Cooperative Transportation

  • Author

    Wang, Ying ; De Silva, Clarence W.

  • Author_Institution
    Dept. of Mech. Eng., Univ. of British Columbia, Vancouver, BC, Canada
  • Volume
    15
  • Issue
    2
  • fYear
    2010
  • fDate
    4/1/2010 12:00:00 AM
  • Firstpage
    261
  • Lastpage
    268
  • Abstract
    This paper presents a modified, distributed Q-learning algorithm, termed sequential Q-learning with Kalman filtering (SQKF), for decision making in multirobot cooperation. The SQKF algorithm has two main characteristics. 1) The learning process is arranged sequentially (i.e., the robots do not make decisions simultaneously, but in a predefined sequence) so as to promote cooperation among the robots and reduce their Q-learning spaces. 2) A robot does not update its Q-values with the observed global reward. Instead, it employs a specific Kalman filter to extract its real local reward from the global reward, and updates its Q-table with this local reward. The SQKF algorithm is intended to solve two problems in multirobot Q-learning: credit assignment and behavior conflicts. The detailed procedure of the SQKF algorithm is presented, and its application is illustrated using a prototype multirobot experimental system. The experimental results show that the algorithm outperforms both the conventional single-agent Q-learning algorithm and the team Q-learning algorithm in the multirobot domain.
  • Keywords
    Kalman filters; decision making; feature extraction; learning (artificial intelligence); multi-robot systems; behavior conflicts; credit assignment; multirobot cooperative transportation; sequential Q-learning with Kalman filtering; Q-learning
  • fLanguage
    English
  • Journal_Title
    Mechatronics, IEEE/ASME Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4435
  • Type

    jour

  • DOI
    10.1109/TMECH.2009.2024681
  • Filename
    5159474
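
The sequential decision idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' SQKF procedure: the abstract does not give the algorithm's details, so everything below is an assumption made for illustration. In particular, the rule that a later robot skips actions already chosen by earlier robots, the action names, and the helper names `select_sequentially` and `q_update` are hypothetical. The Kalman filtering step is not shown because the abstract does not specify its model; the `local_reward` argument simply stands in for whatever local reward estimate that filter would produce.

```python
import random
from collections import defaultdict


def select_sequentially(q_tables, state, actions, epsilon=0.1, rng=random):
    """Pick one action per robot, in list order (the predefined sequence).

    Assumption for illustration: a robot skips actions already taken by
    robots earlier in the sequence, falling back to the full action set
    if nothing is left.
    """
    taken = []
    for q in q_tables:
        available = [a for a in actions if a not in taken] or list(actions)
        if rng.random() < epsilon:
            # Explore: random action among those still available.
            choice = rng.choice(available)
        else:
            # Exploit: highest Q-value among those still available.
            choice = max(available, key=lambda a: q[(state, a)])
        taken.append(choice)
    return taken


def q_update(q, state, action, local_reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update driven by a local reward."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = local_reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


# Hypothetical two-robot example with made-up action names.
actions = ["push", "sweep", "wait"]
q1, q2 = defaultdict(float), defaultdict(float)
q1[("s0", "push")] = 1.0
q2[("s0", "push")] = 1.0
q2[("s0", "sweep")] = 0.5

chosen = select_sequentially([q1, q2], "s0", actions, epsilon=0.0)
q_update(q2, "s0", chosen[1], local_reward=2.0, next_state="s1",
         actions=actions, alpha=0.5, gamma=0.9)
```

With exploration disabled, both robots prefer the same action, but the second robot in the sequence is steered to its next-best choice, which is one simple way a fixed decision order can avoid conflicting behaviors and shrink the set of actions each robot has to consider.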