DocumentCode :
3622317
Title :
Q-Learning with Probability Based Action Policy
Author :
Ugurlu; Biricik
Author_Institution :
Bilgisayar Mühendisliği
fYear :
2006
Firstpage :
1
Lastpage :
4
Abstract :
In Q-learning, the aim is to reach a goal using state-action pairs. When the goal is assigned a large reward, the optimal path is found once the accumulated reward reaches its maximum. When the start and goal points are changed, the information about how to reach the goal becomes useless even if the environment itself does not change. In this study, Q-learning is improved by making it possible to reuse past data. To achieve this, action probabilities are computed for certain start and goal points, and a neural network is trained on those values to estimate the action probabilities for other start and goal points. A radial basis function network is used because it supports local representation and learns quickly from a small number of inputs. When Q-learning is run with the estimated action probabilities, an increase in the speed of reaching the goal is observed.
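The abstract does not give the paper's exact formulation, but the core idea of biasing Q-learning's action selection with precomputed action probabilities can be illustrated with a minimal sketch. Everything below (the environment interface, function name, and parameter values) is an illustrative assumption, not taken from the paper; in the paper, the probabilities for a new start/goal pair would come from the trained radial basis function network rather than being passed in directly.

```python
# Hypothetical sketch: tabular Q-learning where the exploration policy
# samples actions from prior per-state probabilities (in the paper these
# would be predicted by an RBF network for the current start/goal pair).
# The env interface and all hyperparameters here are assumptions.
import numpy as np

def q_learning_with_action_probs(env, action_probs, episodes=500,
                                 alpha=0.1, gamma=0.9):
    """env: object with reset() -> state, step(a) -> (state, reward, done),
    and n_states / n_actions attributes.
    action_probs: (n_states, n_actions) array; each row sums to 1."""
    Q = np.zeros((env.n_states, env.n_actions))
    for _ in range(episodes):
        s = env.reset()
        done = False
        while not done:
            # Sample the action from the prior probabilities instead of
            # a uniform epsilon-greedy exploration step, steering the
            # agent toward actions that worked for similar start/goal pairs.
            a = np.random.choice(env.n_actions, p=action_probs[s])
            s2, r, done = env.step(a)
            # Standard Q-learning update.
            Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
            s = s2
    return Q
```

With uniform rows in action_probs this reduces to random exploration; informative rows concentrate exploration along likely paths to the goal, which is the mechanism the abstract credits for the observed speed-up.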
Keywords :
"Robots","Neural networks","Influenza","Radial basis function networks"
Publisher :
IEEE
Conference_Title :
Signal Processing and Communications Applications, 2006 IEEE 14th
ISSN :
2165-0608
Print_ISBN :
1-4244-0238-7
Type :
conf
DOI :
10.1109/SIU.2006.1659880
Filename :
1659880