Regional Information Center for Science and Technology - Greedy exploration policy of Q-learning based on state balance

DocumentCode :

3383072

Title :

Greedy exploration policy of Q-learning based on state balance

Author :

Zheng, Yu ; Luo, Siwei ; Zhang, Jing

Author_Institution :

Sch. of Comput. & Inf. Technol., Beijing Jiaotong Univ., Beijing

fYear :

2005

fDate :

21-24 Nov. 2005

Firstpage :

Lastpage :

Abstract :

Q-learning is one of the well-established reinforcement learning algorithms and has been widely applied in intelligent control systems, such as the control of robot pose. However, Q-learning suffers from the curse of dimensionality and from difficulty in converging, both arising from its random exploration policy. In this paper, we propose a greedy exploration policy for Q-learning with rule guidance. This exploration policy reduces the exploration of non-optimal actions as much as possible and speeds up the convergence of Q-learning. Simulation results indicate the effectiveness of the proposed method.
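For orientation, the baseline that the abstract contrasts against can be sketched as standard tabular Q-learning with an epsilon-greedy exploration policy. This is not the authors' rule-guided method, which the abstract does not specify. The chain environment, the function names `q_learning` and `greedy_policy`, and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def q_learning(n_states=6, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative only).

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _episode in range(episodes):
        s = 0
        for _step in range(max_steps):
            if s == goal:
                break
            # Epsilon-greedy exploration: act randomly with probability
            # epsilon (or when both actions are tied), otherwise greedily.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            s_next = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0

            # Standard Q-learning update:
            # Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            bootstrap = 0.0 if s_next == goal else max(q[s_next])
            q[s][a] += alpha * (reward + gamma * bootstrap - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Return the greedy action for each state of a learned Q-table."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    table = q_learning()
    print(greedy_policy(table))
```

Every random action in this sketch is an exploration step that may be non-optimal. That cost is what the proposed policy aims to reduce.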

Keywords :

learning (artificial intelligence); Q-learning; greedy exploration policy; intelligent control system; nonoptimal action exploration; reinforcement learning; rule guidance; state balance; Acceleration; Computational modeling; Control systems; Electronic mail; Information technology; Learning; Optimal control; Robot control; State estimation; State-space methods;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

TENCON 2005 - 2005 IEEE Region 10 Conference

Conference_Location :

Melbourne, Vic.

Print_ISBN :

0-7803-9311-2

Electronic_ISBN :

0-7803-9312-0

Type :

conf

DOI :

10.1109/TENCON.2005.300987

Filename :

4085232

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3383072