Title :
Self learning control of constrained Markov chains - a gradient approach
Author :
Vázquez-Abad, Felisa ; Krishnamurthy, Vikram ; Martin, Katerine ; Baltcheva, Irina
Author_Institution :
Dépt. d'Inf. et de Recherche Opér., Univ. de Montréal, Que., Canada
Abstract :
We present stochastic approximation algorithms for computing the locally optimal policy of a constrained average-cost finite-state Markov decision process. The stochastic approximation algorithms require computation of the gradient of the cost function with respect to the parameter that characterizes the randomized policy. This gradient is computed by simulation-based gradient estimation schemes involving weak derivatives. Like neuro-dynamic programming algorithms (e.g., Q-learning or temporal difference methods), the algorithms proposed in this paper are simulation based and do not require explicit knowledge of the underlying parameters, such as transition probabilities. Unlike neuro-dynamic programming methods, however, the proposed algorithms can handle constraints and time-varying parameters. The multiplier-based constrained stochastic gradient algorithm proposed here is also of independent interest in stochastic approximation.
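The abstract's multiplier-based (primal-dual) scheme can be illustrated on a toy problem: descend the Lagrangian in the policy parameter while ascending in the Lagrange multiplier, with decreasing step sizes and projection of the multiplier onto the nonnegative reals. The sketch below is not the paper's algorithm: the two-state MDP, its costs, the sigmoid policy parameterization, and the finite-difference gradient estimator (standing in for the paper's weak-derivative estimator) are all illustrative assumptions.

```python
import numpy as np

# Toy 2-state, 2-action constrained average-cost MDP.
# All numbers are illustrative, not from the paper.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[a, s, s'] under action a=0
              [[0.5, 0.5], [0.6, 0.4]]])  # under action a=1
cost = np.array([[1.0, 2.0], [0.5, 3.0]])  # objective cost[a, s]
con  = np.array([[0.0, 1.0], [2.0, 0.0]])  # constraint cost[a, s]
beta = 0.8                                 # constraint bound

def simulate(theta, T=2000, seed=0):
    """Sample-path averages of objective and constraint cost under the
    randomized policy P(a=1 | s) = sigmoid(theta[s])."""
    rng = np.random.default_rng(seed)
    s, c_avg, d_avg = 0, 0.0, 0.0
    for _ in range(T):
        p1 = 1.0 / (1.0 + np.exp(-theta[s]))
        a = int(rng.random() < p1)
        c_avg += cost[a, s] / T
        d_avg += con[a, s] / T
        s = rng.choice(2, p=P[a, s])
    return c_avg, d_avg

def fd_gradients(theta, h=0.2, seed=0):
    """Central finite-difference gradient estimates with common random
    numbers -- a crude stand-in for the weak-derivative estimator."""
    gc, gd = np.zeros_like(theta), np.zeros_like(theta)
    for i in range(len(theta)):
        e = np.zeros_like(theta)
        e[i] = h
        cp, dp = simulate(theta + e, seed=seed)
        cm, dm = simulate(theta - e, seed=seed)
        gc[i], gd[i] = (cp - cm) / (2 * h), (dp - dm) / (2 * h)
    return gc, gd

# Primal-dual stochastic approximation: gradient step on the Lagrangian
# in theta, multiplier ascent on the constraint, projected onto [0, inf).
theta, lam = np.zeros(2), 0.0
for n in range(1, 61):
    a_n, b_n = 2.0 / (n + 10), 0.5 / (n + 10)   # decreasing step sizes
    gc, gd = fd_gradients(theta, seed=n)
    _, d_avg = simulate(theta, seed=n)
    theta -= a_n * (gc + lam * gd)              # primal descent
    lam = max(0.0, lam + b_n * (d_avg - beta))  # projected dual ascent
```

Note that the algorithm never touches the transition matrix `P` except through simulation, which mirrors the model-free property claimed in the abstract; only the simulator itself needs `P`.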
Keywords :
Markov processes; approximation theory; decision theory; gradient methods; learning systems; self-adjusting systems; constrained Markov chains; constrained average cost finite state Markov decision process; gradient approach; gradient estimation schemes; locally optimal policy; self learning control; stochastic approximation algorithms; time varying parameters; weak derivatives; Approximation algorithms; Computational modeling; Cost function; Kernel; Neurodynamics; Optimal control; State-space methods; Stochastic processes; Telecommunication control;
Conference_Title :
Proceedings of the 41st IEEE Conference on Decision and Control, 2002
Print_ISBN :
0-7803-7516-5
DOI :
10.1109/CDC.2002.1184811