DocumentCode :
2526329
Title :
Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation
Author :
Valcarcel Macua, S. ; Belanovic, P. ; Zazo, S.
Author_Institution :
Escuela Tec. Super. de Ing. de Telecomun., Univ. Politec. de Madrid, Madrid, Spain
fYear :
2012
fDate :
28-30 May 2012
Firstpage :
1
Lastpage :
6
Abstract :
We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information among neighbors. The algorithm is a fully distributed implementation of gradient temporal-difference learning with linear function approximation, making it applicable to multiagent settings. Simulations illustrate the benefit that cooperation brings to learning with the proposed algorithm.
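A minimal sketch of the diffusion idea described in the abstract, assuming GTD2-style local updates and a doubly stochastic combination matrix over a ring of neighbors; the toy Markov chain, step sizes, topology, and all variable names are illustrative assumptions, not details taken from the paper:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy problem (assumption): a small common Markov chain whose global
    # state-value function all agents try to predict.
    n_states, d, n_agents, gamma = 5, 3, 4, 0.9
    P = rng.dirichlet(np.ones(n_states), size=n_states)  # row-stochastic transitions
    r = rng.standard_normal(n_states)                    # per-state rewards
    Phi = rng.standard_normal((n_states, d))             # linear feature map

    # Doubly stochastic combination weights over a ring topology (assumption).
    A = np.zeros((n_agents, n_agents))
    for k in range(n_agents):
        for j in (k - 1, k, k + 1):
            A[k, j % n_agents] = 1.0 / 3.0

    theta = np.zeros((n_agents, d))  # value-function weights, one row per agent
    w = np.zeros((n_agents, d))      # auxiliary GTD weights
    alpha, beta = 0.02, 0.05         # step sizes (assumed values)
    state = rng.integers(n_states, size=n_agents)  # each agent's own trajectory

    for t in range(20000):
        psi = np.empty_like(theta)
        for k in range(n_agents):
            s = state[k]
            s2 = rng.choice(n_states, p=P[s])
            phi, phi2 = Phi[s], Phi[s2]
            delta = r[s] + gamma * theta[k] @ phi2 - theta[k] @ phi  # TD error
            # Local GTD2-style adaptation using the agent's own sample.
            psi[k] = theta[k] + alpha * (phi - gamma * phi2) * (phi @ w[k])
            w[k] = w[k] + beta * (delta - phi @ w[k]) * phi
            state[k] = s2
        # Diffusion step: each agent combines neighbors' intermediate estimates.
        theta = A @ psi
        w = A @ w  # also diffuse the auxiliary weights (assumption)

    # After training, all agents should hold nearly the same value estimate.
    print("max disagreement across agents:", np.abs(theta - theta.mean(0)).max())

The combine step after the local adapt step is the standard adapt-then-combine pattern in diffusion strategies; whether the auxiliary weights are also diffused is a design choice assumed here, not confirmed by the abstract.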
Keywords :
approximation theory; gradient methods; learning (artificial intelligence); cooperative reinforcement learning; diffusion based algorithm; diffusion gradient temporal difference; distributed implementation; global state-value function; linear function approximation; local gradient information; multiple agents; Approximation algorithms; Cost function; Function approximation; Learning; Prediction algorithms; Vectors; TD; cooperative learning; distributed control; distributed decision making; distributed reinforcement learning; distributed temporal difference; multiagent
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2012 3rd International Workshop on Cognitive Information Processing (CIP)
Conference_Location :
Baiona
Print_ISBN :
978-1-4673-1877-8
Type :
conf
DOI :
10.1109/CIP.2012.6232901
Filename :
6232901