• DocumentCode
    2526329
  • Title
    Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation
  • Author
    Valcarcel Macua, S. ; Belanovic, P. ; Zazo, S.
  • Author_Institution
    Escuela Tec. Super. de Ing. de Telecomun., Univ. Politec. de Madrid, Madrid, Spain
  • fYear
    2012
  • fDate
    28-30 May 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information with their neighbors. The algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, which makes it applicable to multiagent settings. Simulations illustrate the benefit of cooperation in learning that the proposed algorithm makes possible.
  • Keywords
    approximation theory; gradient methods; learning (artificial intelligence); cooperative reinforcement learning; diffusion based algorithm; diffusion gradient temporal difference; distributed implementation; global state-value function; linear function approximation; local gradient information; multiple agents; Approximation algorithms; Cost function; Function approximation; Learning; Prediction algorithms; Vectors; TD; cooperative learning; distributed control; distributed decision making; distributed reinforcement learning; distributed temporal difference; multiagent
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2012 3rd International Workshop on Cognitive Information Processing (CIP)
  • Conference_Location
    Baiona
  • Print_ISBN
    978-1-4673-1877-8
  • Type
    conf
  • DOI
    10.1109/CIP.2012.6232901
  • Filename
    6232901