Title :
Diffusion gradient temporal difference for cooperative reinforcement learning with linear function approximation
Author :
Valcarcel Macua, S. ; Belanovic, P. ; Zazo, S.
Author_Institution :
Escuela Tec. Super. de Ing. de Telecomun., Univ. Politec. de Madrid, Madrid, Spain
Abstract :
We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information among neighbors. Our algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, making it applicable to multiagent settings. Simulations illustrate the benefit of cooperation in learning made possible by the proposed algorithm.
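The following is a minimal illustrative sketch, not the authors' exact method: it combines standard GTD2-style updates (primary parameters theta and secondary correction parameters w) with an adapt-then-combine diffusion step, in which each agent first takes a local gradient step and then averages its estimates with its neighbors'. The ring topology, combination weights, feature map, random-walk environment, and step sizes below are all assumptions made for the example.

# Sketch of diffusion gradient TD with linear function approximation (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 4               # agents cooperating over a network (assumed ring topology)
N_STATES = 10              # states of the common Markov chain
N_FEATS = 4                # dimension of the linear feature map
GAMMA = 0.9                # discount factor
ALPHA, BETA = 0.05, 0.05   # step sizes for primary/secondary iterates (illustrative)

# Shared linear feature map: one feature vector per state (assumption).
PHI = rng.normal(size=(N_STATES, N_FEATS))

# Doubly stochastic combination matrix for a ring of agents.
C = np.zeros((N_AGENTS, N_AGENTS))
for k in range(N_AGENTS):
    C[k, k] = 0.5
    C[k, (k - 1) % N_AGENTS] = 0.25
    C[k, (k + 1) % N_AGENTS] = 0.25

theta = np.zeros((N_AGENTS, N_FEATS))  # local value-function parameters
w = np.zeros((N_AGENTS, N_FEATS))      # local secondary (correction) parameters

def step(state):
    """Toy random-walk transition with a reward of +1 at the last state."""
    nxt = (state + rng.choice([-1, 1])) % N_STATES
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

states = rng.integers(N_STATES, size=N_AGENTS)  # each agent follows its own trajectory

for t in range(5000):
    theta_half = np.empty_like(theta)
    w_half = np.empty_like(w)
    # Adapt: each agent takes a local GTD2-style gradient step.
    for k in range(N_AGENTS):
        s = states[k]
        s_next, r = step(s)
        phi, phi_next = PHI[s], PHI[s_next]
        delta = r + GAMMA * phi_next @ theta[k] - phi @ theta[k]  # TD error
        theta_half[k] = theta[k] + ALPHA * (phi - GAMMA * phi_next) * (phi @ w[k])
        w_half[k] = w[k] + BETA * (delta - phi @ w[k]) * phi
        states[k] = s_next
    # Combine: each agent averages its neighbors' intermediate estimates via C.
    theta = C @ theta_half
    w = C @ w_half

print("per-agent parameter spread:", np.linalg.norm(theta - theta.mean(axis=0)))

In this sketch the diffusion (combine) step drives the agents' local parameter vectors toward consensus while the local (adapt) step drives them toward a solution of the projected Bellman equation; the printed spread gives a rough indication of how close the agents are to agreement.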
Keywords :
approximation theory; gradient methods; learning (artificial intelligence); cooperative reinforcement learning; diffusion based algorithm; diffusion gradient temporal difference; distributed implementation; global state-value function; linear function approximation; local gradient information; multiple agents; Approximation algorithms; Cost function; Function approximation; Learning; Prediction algorithms; Vectors; TD; cooperative learning; distributed control; distributed decision making; distributed reinforcement learning; distributed temporal difference; multiagent
Conference_Title :
2012 3rd International Workshop on Cognitive Information Processing (CIP)
Conference_Location :
Baiona
Print_ISBN :
978-1-4673-1877-8
DOI :
10.1109/CIP.2012.6232901