DocumentCode
395547
Title
Anticipative reinforcement learning
Author
Maire, Frederic
Author_Institution
Sch. of Comput. Sci. & Software Eng., Queensland Univ. of Technol., Brisbane, Qld., Australia
Volume
3
fYear
2002
fDate
18-22 Nov. 2002
Firstpage
1428
Abstract
This paper introduces anticipative reinforcement learning (ARL), a method that addresses the breakdown of value-based algorithms implemented with neural networks on problems with small time steps and continuous action and state spaces. In ARL, an agent is made of three components: the actor, the critic, and the model (the model is as in Dyna, but we use it differently). The main originality of ARL lies in the action selection process: the agent builds a set of candidate actions that includes the action recommended by the actor plus some random actions. Once the set of candidate actions is built, the candidates are ranked by anticipating what would happen if each action were taken and then followed by a sequence of actions drawn from the current policy (anticipation by applying the model iteratively over a finite look-ahead). We demonstrate the benefits of looking ahead with experiments on a Khepera robot.
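For illustration, the candidate-ranking procedure described in the abstract can be sketched as follows. This is a minimal reconstruction from the abstract alone, not the paper's implementation: the callables `actor`, `critic`, and `model`, and all parameter names and default values, are assumptions.

```python
import numpy as np

def anticipative_action_selection(state, actor, critic, model,
                                  n_random=10, horizon=5, gamma=0.95,
                                  action_low=-1.0, action_high=1.0,
                                  action_dim=2):
    """Rank candidate actions by a finite look-ahead through a learned model.

    Assumed interfaces (hypothetical, not from the paper):
      actor(s)    -> greedy action for state s
      critic(s)   -> estimated value of state s
      model(s, a) -> predicted (next_state, reward) pair
    """
    # Candidate set: the actor's recommended action plus random actions.
    candidates = [actor(state)]
    candidates += [np.random.uniform(action_low, action_high, size=action_dim)
                   for _ in range(n_random)]

    best_action, best_score = None, -np.inf
    for action in candidates:
        # Anticipation: apply the candidate action first, then follow the
        # current policy through the model for the remaining look-ahead steps.
        s, ret, discount = state, 0.0, 1.0
        a = action
        for _ in range(horizon):
            s, r = model(s, a)
            ret += discount * r
            discount *= gamma
            a = actor(s)
        # Bootstrap the tail of the return with the critic's value estimate.
        ret += discount * critic(s)
        if ret > best_score:
            best_action, best_score = action, ret
    return best_action
```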
Keywords
function approximation; generalisation (artificial intelligence); learning (artificial intelligence); mobile robots; neural nets; state-space methods; Khepera robot; anticipative reinforcement learning; neural networks
fLanguage
English
Publisher
ieee
Conference_Titel
Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), 2002
Print_ISBN
981-04-7524-1
Type
conf
DOI
10.1109/ICONIP.2002.1202856
Filename
1202856