DocumentCode
2526329
Title
Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation
Author
Valcarcel Macua, S.; Belanovic, P.; Zazo, S.
Author_Institution
Escuela Tec. Super. de Ing. de Telecomun., Univ. Politec. de Madrid, Madrid, Spain
fYear
2012
fDate
28-30 May 2012
Firstpage
1
Lastpage
6
Abstract
We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information with their neighbors. The algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, which makes it applicable to multiagent settings. Simulations illustrate the benefit of cooperation in learning, as made possible by the proposed algorithm.
Keywords
approximation theory; gradient methods; learning (artificial intelligence); cooperative reinforcement learning; diffusion based algorithm; diffusion gradient temporal difference; distributed implementation; global state-value function; linear function approximation; local gradient information; multiple agents; Approximation algorithms; Cost function; Function approximation; Learning; Prediction algorithms; Vectors; TD; cooperative learning; distributed control; distributed decision making; distributed reinforcement learning; distributed temporal difference; multiagent
fLanguage
English
Publisher
IEEE
Conference_Titel
Cognitive Information Processing (CIP), 2012 3rd International Workshop on
Conference_Location
Baiona
Print_ISBN
978-1-4673-1877-8
Type
conf
DOI
10.1109/CIP.2012.6232901
Filename
6232901