DocumentCode :
3622317
Title :
Q-Learning with Probability Based Action Policy
Author :
Ugurlu; Biricik
Author_Institution :
Bilgisayar Mühendisliği
fYear :
2006
Firstpage :
1
Lastpage :
4
Abstract :
In Q-learning, the aim is to reach a goal using state-action pairs. When the goal is assigned a large reward, the optimal path is found once the accumulated reward reaches its maximum. When the start and goal points are changed, the information about how to reach the goal becomes useless even if the environment itself does not change. In this study, Q-learning is improved by making it possible to reuse past data. To achieve this, action probabilities are computed for certain start and goal points, and a neural network is trained on those values to estimate the action probabilities for other start and goal points. A radial basis function network is used because it supports local representation and learns quickly from a small number of inputs. When Q-learning is run with the estimated action probabilities, an increase in the speed of reaching the goal is observed.
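The abstract does not give the paper's exact formulation, but the core idea of biasing Q-learning's action selection with precomputed action probabilities can be illustrated with a minimal sketch. Everything below (the environment interface, function name, and parameter values) is an illustrative assumption, not taken from the paper; in the paper, the probabilities for a new start/goal pair would come from the trained radial basis function network rather than being passed in directly.

```python
# Hypothetical sketch: tabular Q-learning where the exploration policy
# samples actions from prior per-state probabilities (in the paper these
# would be predicted by an RBF network for the current start/goal pair).
# The env interface and all hyperparameters here are assumptions.
import numpy as np

def q_learning_with_action_probs(env, action_probs, episodes=500,
                                 alpha=0.1, gamma=0.9):
    """env: object with reset() -> state, step(a) -> (state, reward, done),
    and n_states / n_actions attributes.
    action_probs: (n_states, n_actions) array; each row sums to 1."""
    Q = np.zeros((env.n_states, env.n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Sample the action from the prior probabilities instead of
            # a uniform epsilon-greedy exploration step, steering the
            # agent toward actions that worked for similar start/goal pairs.
            a = np.random.choice(env.n_actions, p=action_probs[s])
            s2, r, done = env.step(a)
            # Standard Q-learning update.
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

With uniform rows in action_probs this reduces to random exploration; informative rows concentrate exploration along likely paths to the goal, which is the mechanism the abstract credits for the observed speed-up.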
Keywords :
"Robots","Neural networks","Influenza","Radial basis function networks"
Publisher :
IEEE
Conference_Title :
Signal Processing and Communications Applications, 2006 IEEE 14th
ISSN :
2165-0608
Print_ISBN :
1-4244-0238-7
Type :
conf
DOI :
10.1109/SIU.2006.1659880
Filename :
1659880