Regional Information Center for Science and Technology (RICeST) - Active exploration by searching for experiments that falsify the computed control policy

DocumentCode :

2498243

Title :

Active exploration by searching for experiments that falsify the computed control policy

Author :

Fonteneau, Raphael ; Murphy, Susan A. ; Wehenkel, Louis ; Ernst, Damien

Author_Institution :

Dept. of Electr. Eng. & Comput. Sci., Univ. of Liège, Liège, Belgium

fYear :

2011

fDate :

11-15 April 2011

Firstpage :

Lastpage :

Abstract :

We propose a strategy for experiment selection - in the context of reinforcement learning - based on the idea that the most interesting experiments to carry out at some stage are those that are the most liable to falsify the current hypothesis about the optimal control policy. We cast this idea in a context where a policy learning algorithm and a model identification method are given a priori. Experiments are selected if, using the learnt environment model, they are predicted to yield a revision of the learnt control policy. Algorithms and simulation results are provided for a deterministic system with discrete action space. They show that the proposed approach is promising.
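The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the five-state chain, the nearest-sample model in `predict`, the tabular learner in `learn_policy`, and the constants `HORIZON` and `GAMMA` are all assumptions made for illustration. Only the selection test itself (keep a candidate experiment if its predicted outcome would revise the learnt policy) is taken from the abstract.

```python
from typing import Dict, List, Tuple

State, Action = int, int
Sample = Tuple[State, Action, float, State]  # (state, action, reward, next state)

# Hypothetical toy problem: a 5-state chain with two discrete actions.
STATES: List[State] = list(range(5))
ACTIONS: List[Action] = [-1, +1]
HORIZON = 4   # assumed planning horizon
GAMMA = 0.9   # assumed discount factor


def predict(data: List[Sample], x: State, u: Action) -> Tuple[float, State]:
    """Assumed model identification: reuse the displacement and reward of the
    nearest observed sample that used the same action."""
    same_action = [s for s in data if s[1] == u]
    if not same_action:
        return 0.0, x  # nothing known about u: predict no reward and no move
    sx, _, r, sy = min(same_action, key=lambda s: abs(s[0] - x))
    y = min(max(x + (sy - sx), STATES[0]), STATES[-1])  # stay inside the chain
    return r, y


def learn_policy(data: List[Sample]) -> Dict[State, Action]:
    """Assumed policy learner: finite-horizon value iteration over the observed
    samples only; untried (state, action) pairs get a default value of 0."""
    table = {(x, u): (r, y) for x, u, r, y in data}
    value = {x: 0.0 for x in STATES}
    policy = {x: ACTIONS[0] for x in STATES}
    for _ in range(HORIZON):
        q = {}
        for x in STATES:
            for u in ACTIONS:
                if (x, u) in table:
                    r, y = table[(x, u)]
                    q[(x, u)] = r + GAMMA * value[y]
                else:
                    q[(x, u)] = 0.0
        # Ties are broken in favour of the first action in ACTIONS.
        policy = {x: max(ACTIONS, key=lambda u: q[(x, u)]) for x in STATES}
        value = {x: q[(x, policy[x])] for x in STATES}
    return policy


def falsifying_experiments(data: List[Sample]) -> List[Tuple[State, Action]]:
    """Return the untried (state, action) pairs whose model-predicted outcome
    would change the policy learnt from the current data."""
    current = learn_policy(data)
    tried = {(x, u) for x, u, _, _ in data}
    selected = []
    for x in STATES:
        for u in ACTIONS:
            if (x, u) in tried:
                continue
            r_hat, y_hat = predict(data, x, u)
            revised = learn_policy(data + [(x, u, r_hat, y_hat)])
            if revised != current:
                selected.append((x, u))
    return selected
```

With a single observed transition, `falsifying_experiments([(3, 1, 1.0, 4)])` returns `[(0, 1), (1, 1), (2, 1), (4, 1)]`: under the assumed model, trying action `+1` in any other state is predicted to pay off and would therefore revise the policy, while trying action `-1` anywhere is predicted to leave it unchanged.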

Keywords :

identification; learning (artificial intelligence); optimal control; active exploration; computed control policy; discrete action space; experiment selection; learnt control policy; model identification method; optimal control policy; policy learning algorithm; reinforcement learning; Approximation algorithms; Approximation methods; Heuristic algorithms; Inference algorithms; Optimal control; Prediction algorithms; Predictive models;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Adaptive Dynamic Programming And Reinforcement Learning (ADPRL), 2011 IEEE Symposium on

Conference_Location :

Paris

Print_ISBN :

978-1-4244-9887-1

Type :

conf

DOI :

10.1109/ADPRL.2011.5967364

Filename :

5967364

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2498243