Regional Information Center for Science and Technology - The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards

DocumentCode :

2717368

Title :

The Knowledge Gradient Policy for Offline Learning with Independent Normal Rewards

Author :

Frazier, Peter ; Powell, Warren

Author_Institution :

Dept. of Operations Res. & Financial Eng., Princeton Univ., NJ

fYear :

2007

fDate :

1-5 April 2007

Firstpage :

143

Lastpage :

150

Abstract :

We define a new type of policy, the knowledge gradient policy, in the context of an offline learning problem. We show how to compute the knowledge gradient policy efficiently and demonstrate through Monte Carlo simulations that it performs as well as or better than a number of existing learning policies.

Keywords :

Monte Carlo methods; gradient methods; learning systems; operations research; Monte Carlo simulations; independent normal rewards; knowledge gradient policy; offline learning; Bandwidth; Bayesian methods; Dynamic programming; Knowledge engineering; Learning; Mirrors; Operations research; Performance evaluation; Response surface methodology; Time measurement

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Approximate Dynamic Programming and Reinforcement Learning, 2007. ADPRL 2007. IEEE International Symposium on

Conference_Location :

Honolulu, HI

Print_ISBN :

1-4244-0706-0

Type :

conf

DOI :

10.1109/ADPRL.2007.368181

Filename :

4220826

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2717368