A comparative study of policies in Q-learning for foraging tasks

Author

Mohan, Yogeswaran ; Ponnambalam, S.G. ; Inayat-Hussain, Jawaid I.

Author_Institution

Sch. of Eng., Monash Univ., Petaling Jaya, Malaysia

fYear

2009

fDate

9-11 Dec. 2009

Firstpage

134

Lastpage

139

Abstract

Q-learning is a machine learning technique that learns how to map states to actions so as to maximize rewards. It has been applied to tasks such as foraging and to soccer-playing and prey-pursuing robots. In this paper, a simple foraging task is used to study the influence of the action-selection policies reported in the open literature. A mobile robot searches for pucks and retrieves them to a home location. The goal of the study is to identify an efficient policy for Q-learning that maximizes the number of pucks collected and minimizes the number of collisions in the environment. Four policies, namely greedy, epsilon-greedy, Boltzmann distribution and random search, are compared on the foraging task, and the results are presented.
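
The four policies named in the abstract differ only in how an action is chosen from the Q-values of the current state. The following is a minimal Python sketch of the standard textbook forms of these rules, together with the usual tabular Q-learning update. It is not the authors' implementation: the function names, the list-based Q-table, and the parameter values (epsilon, temperature, alpha, gamma) are assumptions made for illustration, since the record does not give the paper's state and action encoding or its parameter settings.

```python
import math
import random


def greedy(q_values):
    """Pick the action with the highest Q-value (ties go to the lowest index)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


def random_search(q_values, rng=random):
    """Ignore the Q-values and pick any action uniformly at random."""
    return rng.randrange(len(q_values))


def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """Explore uniformly with probability epsilon, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def boltzmann_probabilities(q_values, temperature=1.0):
    """Softmax over Q-values; a higher temperature gives more uniform choices."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def boltzmann(q_values, temperature=1.0, rng=random):
    """Sample an action in proportion to its Boltzmann probability."""
    probs = boltzmann_probabilities(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on a list-of-lists Q-table.

    alpha (learning rate) and gamma (discount factor) are illustrative
    values, not those used in the paper.
    """
    best_next = max(q_table[next_state])
    td_error = reward + gamma * best_next - q_table[state][action]
    q_table[state][action] += alpha * td_error
```

In a foraging loop, one of the selection functions would be called with the row of the Q-table for the robot's current state, and `q_update` would be applied after each step using the observed reward, for example a positive reward for delivering a puck and a negative one for a collision (the reward scheme here is likewise an assumption).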

Keywords

learning (artificial intelligence); mobile robots; random processes; search problems; Boltzmann distribution; Q-learning; epsilon-greedy policy; foraging task; machine learning; mobile robot; random search; Convergence; Mechanical engineering; Scattering; Testing; exploration-exploitation; foraging; policy; reinforcement learning

fLanguage

English

Publisher

IEEE

Conference_Title

2009 World Congress on Nature & Biologically Inspired Computing (NaBIC 2009)

Conference_Location

Coimbatore

Print_ISBN

978-1-4244-5053-4

Type

conf

DOI

10.1109/NABIC.2009.5393616

Filename

5393616