• DocumentCode
    2923957
  • Title
    Distributed reinforcement learning in multi-agent networks
  • Author
    Kar, Soummya ; Moura, Jose M. F. ; Poor, H. Vincent
  • Author_Institution
    Dept. of ECE, Carnegie Mellon Univ., Pittsburgh, PA, USA
  • fYear
    2013
  • fDate
    15-18 Dec. 2013
  • Firstpage
    296
  • Lastpage
    299
  • Abstract
    Distributed reinforcement learning algorithms for collaborative multi-agent Markov decision processes (MDPs) are presented and analyzed. The networked setup consists of a collection of agents (learners) which respond differently (depending on their instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. With the objective of jointly learning the optimal stationary control policy (in the absence of global state transition and local agent cost statistics) that minimizes network-averaged infinite horizon discounted cost, the paper presents distributed variants of Q-learning of the consensus + innovations type in which each agent sequentially refines its learning parameters by locally processing its instantaneous payoff data and the information received from neighboring agents. Under broad conditions on the multi-agent decision model and mean connectivity of the inter-agent communication network, the proposed distributed algorithms are shown to achieve optimal learning asymptotically, i.e., almost surely (a.s.) each network agent is shown to learn the value function and the optimal stationary control policy of the collaborative MDP asymptotically. Further, convergence rate estimates for the proposed class of distributed learning algorithms are obtained.
  • Keywords
    Markov processes; distributed algorithms; learning (artificial intelligence); multi-agent systems; optimal control; telecontrol; Q learning; distributed learning algorithms; distributed reinforcement learning; interagent communication network; local agent cost statistics; multiagent Markov decision processes; multiagent networks; Approximation methods; Collaboration; Convergence; Learning (artificial intelligence); Process control; Stochastic processes; Technological innovation; Multi-agent stochastic control; collaborative network processing; consensus + innovations; distributed Q-learning; distributed stochastic approximation; reinforcement learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2013 IEEE 5th International Workshop on
  • Conference_Location
    St. Martin
  • Print_ISBN
    978-1-4673-3144-9
  • Type
    conf
  • DOI
    10.1109/CAMSAP.2013.6714066
  • Filename
    6714066
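
The abstract describes distributed Q-learning of the consensus + innovations type but does not state the update rule. The following is a minimal sketch of what one such step could look like, not the paper's exact algorithm. It assumes a tabular Q-function per agent, a fixed neighbor map, and constant weights alpha and beta (analyses of this kind typically use decaying weight sequences). The function name qd_update and all parameter names are hypothetical.

```python
from typing import Dict, List

QTable = List[List[float]]  # indexed as Q[state][action]


def qd_update(
    q_tables: List[QTable],
    neighbors: Dict[int, List[int]],
    x: int,
    u: int,
    x_next: int,
    costs: List[float],
    alpha: float,
    beta: float,
    gamma: float,
) -> List[QTable]:
    """One synchronous consensus + innovations step for the visited pair (x, u).

    Assumed form (illustrative only): each agent n subtracts a weighted
    disagreement with its neighbors (consensus) and adds a weighted local
    temporal-difference error built from its own cost sample (innovation).
    """
    # Copy first so every agent reads its neighbors' values from the previous step.
    new_tables = [[row[:] for row in q] for q in q_tables]
    for n, q in enumerate(q_tables):
        # Consensus term: disagreement with each neighbor on the entry (x, u).
        consensus = sum(q[x][u] - q_tables[l][x][u] for l in neighbors[n])
        # Innovation term: local TD error; min over actions because costs are minimized.
        innovation = costs[n] + gamma * min(q[x_next]) - q[x][u]
        new_tables[n][x][u] = q[x][u] - beta * consensus + alpha * innovation
    return new_tables


if __name__ == "__main__":
    # Two agents, two states, two actions; only the consensus term is active.
    q0 = [[0.0, 0.0], [0.0, 0.0]]
    q1 = [[1.0, 0.0], [0.0, 0.0]]
    out = qd_update([q0, q1], {0: [1], 1: [0]}, x=0, u=0, x_next=1,
                    costs=[0.0, 0.0], alpha=0.0, beta=0.25, gamma=0.9)
    print(out[0][0][0], out[1][0][0])
```

With alpha set to zero the two agents only average toward each other; with beta set to zero each agent runs ordinary Q-learning on its own cost samples. The combination is what lets every agent track the network-averaged cost while seeing only local data.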