• DocumentCode
    2569518
  • Title

    Real-valued Q-learning in multi-agent cooperation

  • Author

    Hwang, Kao-Shing ; Lo, Chia-Yue ; Chen, Kim-Joan

  • fYear
    2009
  • fDate
    11-14 Oct. 2009
  • Firstpage
    395
  • Lastpage
    400
  • Abstract
    In this paper, we propose a Q-learning algorithm with a continuous action policy and extend it to a multi-agent system. We examine the algorithm on a task in which two robots act independently but are connected by a straight bar. The robots must cooperate to move to the goal while avoiding the obstacles in the environment. Conventional Q-learning requires a predefined, discrete state space and therefore cannot distinguish between different situations that fall within the same state. We introduce a stochastic recording real-valued unit into Q-learning to differentiate the actions corresponding to different state inputs that are categorized into the same state. This unit can be regarded as a combination of an action evaluation module, which models and produces the expected evaluation signal, and an action selection unit, which generates an action expected to perform better by using a probability distribution function that estimates an optimal action selection policy. Results from both simulation and experiment demonstrate the better performance and applicability of the proposed learning model.
  • Keywords
    learning (artificial intelligence); multi-robot systems; statistical distributions; multi-agent cooperation; optimal action selection policy; probability distribution; real-valued Q-learning; stochastic recording real-valued unit; Cybernetics; Gaussian distribution; Learning; Multiagent systems; Orbital robotics; Probability distribution; Signal generators; State-space methods; Stochastic processes; USA Councils;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1062-922X
  • Print_ISBN
    978-1-4244-2793-2
  • Electronic_ISBN
    1062-922X
  • Type

    conf

  • DOI
    10.1109/ICSMC.2009.5346188
  • Filename
    5346188