Regional Information Center for Science and Technology - A new criterion using information gain for action selection strategy in reinforcement learning

DocumentCode :

1026877

Title :

A new criterion using information gain for action selection strategy in reinforcement learning

Author :

Iwata, Kazunori ; Ikeda, Kazushi ; Sakai, Hideaki

Author_Institution :

Graduate Sch. of Informatics, Kyoto Univ., Japan

Volume :

Issue :

fYear :

2004

fDate :

7/1/2004

Firstpage :

792

Lastpage :

799

Abstract :

In this paper, we regard the sequence of returns as outputs from a parametric compound source. Utilizing the fact that the coding rate of the source shows the amount of information about the return, we describe ℓ-learning algorithms based on the predictive coding idea for estimating an expected information gain concerning future information and give a convergence proof of the information gain. Using the information gain, we propose the ratio ω of return loss to information gain as a new criterion to be used in probabilistic action-selection strategies. In experimental results, we found that our ω-based strategy performs well compared with the conventional Q-based strategy.
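The abstract does not define the return loss, the information gain estimate, or the exact form of the probabilistic strategy, so the following is only a minimal sketch under stated assumptions. It assumes the return loss of an action is the gap between the best Q-value and that action's Q-value, that a positive information gain estimate per action is already available, and that actions are drawn from a Boltzmann (softmax) distribution favoring small ω. The names `omega`, `boltzmann_probs`, `select_action`, and `temperature` are hypothetical and do not come from the paper.

```python
import math
import random


def omega(q_values, info_gains, eps=1e-12):
    """Ratio of return loss to information gain, one value per action.

    Assumption: return loss = max(Q) - Q(a). The paper may define it differently.
    """
    best = max(q_values)
    # Guard against division by zero when an information gain estimate is 0.
    return [(best - q) / max(g, eps) for q, g in zip(q_values, info_gains)]


def boltzmann_probs(scores, temperature=1.0):
    """Softmax over -score / temperature, so smaller scores get higher probability."""
    logits = [-s / temperature for s in scores]
    m = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - m) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, info_gains, temperature=1.0, rng=random):
    """Sample an action index with probability decreasing in omega."""
    probs = boltzmann_probs(omega(q_values, info_gains), temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Under these assumptions, two actions with the same return loss are distinguished by their information gain: the one expected to reveal more about the return gets the smaller ω and is therefore sampled more often, whereas a Q-based softmax would treat them identically.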

Keywords :

encoding; learning (artificial intelligence); ℓ-learning algorithms; information gain; predictive coding; probabilistic action-selection strategy; reinforcement learning; source coding rate; Convergence; Educational technology; Encoding; Entropy; Informatics; Learning; Predictive coding; Robot control; Source coding; Uncertainty; Algorithms; Artificial Intelligence; Computer Simulation; Decision Support Techniques; Information Storage and Retrieval; Information Theory; Models, Statistical; Neural Networks (Computer); Probability Learning; Reinforcement (Psychology)

fLanguage :

English

Journal_Title :

Neural Networks, IEEE Transactions on

Publisher :

IEEE

ISSN :

1045-9227

Type :

jour

DOI :

10.1109/TNN.2004.828760

Filename :

1310353

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1026877