• Title of article

    Concurrent Markov decision processes for robot team learning

  • Author/Authors

    Girard, Justin and Reza Emami, M.

  • Pages
    12
  • From page
    223
  • To page
    234
  • Abstract
    Multi-agent learning, in a decision-theoretic sense, may run into deficiencies if a single Markov decision process (MDP) is used to model agent behaviour. This paper discusses an approach to overcoming such deficiencies by considering a multi-agent learning problem as a concurrence between individual learning and task allocation MDPs. This approach, called Concurrent MDP (CMDP), is contrasted with other MDP models, including decentralized MDP. The individual MDP problem is solved by a Q-Learning algorithm, guaranteed to settle on a locally optimal reward-maximization policy. For the task allocation MDP, several different concurrent individual and social learning solutions are considered. Through a heterogeneous team foraging case study, it is shown that the CMDP-based learning mechanisms reduce both simulation time and total agent learning effort.
  • Keywords
    robot team, heterogeneous team, reinforcement learning, Markov decision process, multi-agent learning
  • Journal title
    Astroparticle Physics
  • Record number

    2048658