DocumentCode :
1805450
Title :
Infinite-horizon gradient estimation for semi-Markov decision processes
Author :
Li, Yanjie ; Cao, Fang
Author_Institution :
Shenzhen Grad. Sch., Harbin Inst. of Technol., Shenzhen, China
fYear :
2011
fDate :
15-18 May 2011
Firstpage :
926
Lastpage :
931
Abstract :
This paper presents a performance gradient formula for semi-Markov decision processes under the average-reward criterion. Using this formula, we propose an infinite-horizon online (sample-path-based) gradient estimation algorithm. The algorithm naturally extends the online gradient estimation algorithm for discrete-time Markov systems to continuous-time semi-Markov models. In particular, the new algorithm requires less storage than the algorithm that has appeared in the literature.
Keywords :
Markov processes; continuous time systems; decision theory; discrete time systems; gradient methods; average reward criterion; continuous time semiMarkov models; discrete time Markov systems; infinite horizon online gradient estimation algorithm; semiMarkov decision processes; Algorithm design and analysis; Approximation algorithms; Approximation methods; Equations; Estimation; Markov processes; Optimization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control Conference (ASCC), 2011 8th Asian
Conference_Location :
Kaohsiung
Print_ISBN :
978-1-61284-487-9
Electronic_ISBN :
978-89-956056-4-6
Type :
conf
Filename :
5899196