A connectionist actor-critic algorithm for faster learning and biological plausibility

Author

Johard, Leonard; Ruffaldi, Emanuele

Author_Institution

Scuola Superiore S. Anna, PERCRO, Pisa, Italy

fYear

2014

fDate

May 31 2014-June 7 2014

Firstpage

3903

Lastpage

3909

Abstract

We propose a novel, biologically plausible actor-critic algorithm that uses policy gradients to achieve practical, model-free reinforcement learning. It does not rely on backpropagation and is the first neural actor-critic that relies only on locally available information. We show that it has an advantage over pure policy gradient methods in motor learning performance on the polecart problem. We are also able to closely simulate the dopaminergic signaling patterns of rats confronted with a two-cue problem, showing that local, connectionist models can effectively model the functioning of the intrinsic reward system.

Keywords

biology computing; gradient methods; learning (artificial intelligence); neural nets; biologically plausible actor-critic algorithm; connectionist actor-critic algorithm; dopaminergic signaling patterns; intrinsic reward system; model-free reinforcement learning; neural actor-critic; polecart problem; policy gradients; Backpropagation; Biological system modeling; Learning (artificial intelligence); Neurons; Supervised learning; Training

fLanguage

English

Publisher

IEEE

Conference_Titel

Robotics and Automation (ICRA), 2014 IEEE International Conference on

Conference_Location

Hong Kong

Type

conf

DOI

10.1109/ICRA.2014.6907425

Filename

6907425