A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes

Author

Bhatnagar, Shalabh ; Abdulla, Mohammed Shahid

Author_Institution

Dept. of Comput. Sci. & Autom., Indian Inst. of Sci., Bangalore

fYear

2006

fDate

13-15 Dec. 2006

Firstpage

5519

Lastpage

5524

Abstract

We develop a simulation-based algorithm for finite horizon Markov decision processes with finite state and finite action spaces. Illustrative numerical experiments with the proposed algorithm are shown for problems in flow control of communication networks and capacity switching in semiconductor fabrication.

Keywords

Markov processes; decision theory; learning (artificial intelligence); actor-critic algorithms; capacity switching; communication network; finite action space; finite horizon Markov decision process; finite state space; flow control; normalized Hadamard matrix; reinforcement learning; semiconductor fabrication; timescale stochastic approximation; Approximation algorithms; Communication networks; Communication system control; Computational modeling; Convergence; Costs; Learning; Poisson equations; Recursive estimation; Stochastic processes; Finite horizon Markov decision processes; normalized Hadamard matrices; two timescale stochastic approximation

fLanguage

English

Publisher

IEEE

Conference_Titel

Decision and Control, 2006 45th IEEE Conference on

Conference_Location

San Diego, CA

Print_ISBN

1-4244-0171-2

Type

conf

DOI

10.1109/CDC.2006.377190

Filename

4177082