Regional Information Center for Science and Technology (RICeST) - Reinforcement learning for model building and variance-penalized control

DocumentCode :

1817265

Title :

Reinforcement learning for model building and variance-penalized control

Author :

Gosavi, Abhijit

Author_Institution :

Dept. of Eng. Manage. & Syst. Eng., Missouri Univ. of Sci. & Technol., Rolla, MO, USA

fYear :

2009

fDate :

13-16 Dec. 2009

Firstpage :

373

Lastpage :

379

Abstract :

Reinforcement learning (RL) is a simulation-based technique for solving Markov decision problems, also called Markov decision processes (MDPs). It is especially useful when the transition probabilities of the MDP are hard to obtain or when the number of states in the problem is too large. In this paper, we present a new model-based RL algorithm that builds the transition probability model without generating the transition probabilities themselves; by contrast, existing model-based RL methods in the literature attempt to compute the transition probabilities. We also present a variance-penalized Bellman equation and an RL algorithm that uses it to solve a variance-penalized MDP. We conclude with numerical experiments on these algorithms.
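To make the idea of a variance-penalized criterion concrete, the following minimal sketch estimates such a score for a fixed policy purely by simulation. It is not the algorithm from the paper: the abstract does not state the equation, so the score used here (mean one-step reward minus theta times its variance) is an assumption based on one common formulation of variance-penalized MDPs. The two-state MDP, its numbers, and the names `simulate_step` and `variance_penalized_score` are all hypothetical.

```python
import random

# Hypothetical two-state, two-action MDP, used only for illustration.
# P[s][a] is the probability of moving to state 0 from state s under action a.
# R[s][a][s2] is the immediate reward for the transition s -> s2 under action a.
P = {0: {0: 0.7, 1: 0.9}, 1: {0: 0.4, 1: 0.2}}
R = {
    0: {0: (6.0, -5.0), 1: (10.0, 17.0)},
    1: {0: (7.0, 12.0), 1: (-14.0, 13.0)},
}


def simulate_step(state, action, rng):
    """Simulator: the caller sees only the sampled next state and reward."""
    next_state = 0 if rng.random() < P[state][action] else 1
    return next_state, R[state][action][next_state]


def variance_penalized_score(policy, theta, steps=200_000, seed=0):
    """Estimate (mean, variance, mean - theta * variance) of the one-step
    reward along a single simulated trajectory under a fixed policy.

    Assumption: this mean-minus-penalized-variance score is one common
    formulation, not necessarily the one used in the paper.
    """
    rng = random.Random(seed)
    state = 0
    total = 0.0
    total_sq = 0.0
    for _ in range(steps):
        state, reward = simulate_step(state, policy[state], rng)
        total += reward
        total_sq += reward * reward
    mean = total / steps
    variance = total_sq / steps - mean * mean
    return mean, variance, mean - theta * variance


if __name__ == "__main__":
    # Compare the two actions in state 0 while always taking action 0 in state 1.
    for policy in [(0, 0), (1, 0)]:
        mean, variance, score = variance_penalized_score(policy, theta=0.1)
        print(policy, round(mean, 3), round(variance, 3), round(score, 3))
```

The estimator never reads `P` directly; it only calls the simulator, which mirrors the setting the abstract describes, where transition probabilities are hard to obtain. A larger `theta` penalizes risky policies more heavily, so the preferred policy can change as `theta` grows.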

Keywords :

Markov processes; learning (artificial intelligence); probability; simulation; Markov decision problems; Markov decision processes; RL algorithm; model building; reinforcement learning; simulation-based technique; transition probability; variance-penalized Bellman equation; variance-penalized control; Artificial neural networks; Bayesian methods; Computer networks; Dynamic programming; Equations; Function approximation; Learning; Modeling; Research and development management; Systems engineering and theory;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Proceedings of the 2009 Winter Simulation Conference (WSC)

Conference_Location :

Austin, TX

Print_ISBN :

978-1-4244-5770-0

Type :

conf

DOI :

10.1109/WSC.2009.5429344

Filename :

5429344

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1817265