DocumentCode :
664235
Title :
Computing monotone policies for Markov decision processes by exploiting sparsity
Author :
Krishnamurthy, Vikram ; Rojas, Cristian R. ; Wahlberg, Bo
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of British Columbia, Vancouver, BC, Canada
fYear :
2013
fDate :
4-5 Nov. 2013
Firstpage :
1
Lastpage :
6
Abstract :
This paper considers Markov decision processes whose optimal policy is a randomized mixture of monotone increasing policies. Such monotone policies have an inherent sparsity structure. We present a two-stage convex optimization algorithm for computing the optimal policy that exploits this sparsity. It combines an alternating direction method of multipliers (ADMM), which solves a linear programming problem over the joint state-action probabilities, with a subgradient step that promotes the monotone sparsity pattern in the conditional probabilities of the action given the state. In the second stage, sum-of-norms regularization is used to promote the monotone structure of the optimal policy.
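To illustrate the regularizer mentioned in the abstract, the sketch below computes a sum-of-norms penalty over adjacent states of a conditional policy θ(a|s). This is a minimal, self-contained illustration (not the authors' implementation); the 5-state, 2-action policies are hypothetical. A monotone policy is piecewise constant in the state with few switch points, so adjacent columns of θ rarely differ and the penalty is small; a policy that switches often is penalized heavily.

```python
import math

def sum_of_norms(policy):
    """Sum over adjacent states s of the Euclidean norm of
    theta(.|s+1) - theta(.|s).  Policies that are piecewise constant
    in the state (the monotone sparsity pattern) score low, because
    most adjacent differences are zero."""
    total = 0.0
    for s in range(len(policy) - 1):
        diff = [a - b for a, b in zip(policy[s + 1], policy[s])]
        total += math.sqrt(sum(d * d for d in diff))
    return total

# Hypothetical conditional policies theta(a|s), rows indexed by state:
monotone  = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1]]  # one switch point
scattered = [[1, 0], [0, 1], [1, 0], [0, 1], [1, 0]]  # switches everywhere

print(sum_of_norms(monotone))   # sqrt(2)   ~ 1.414
print(sum_of_norms(scattered))  # 4*sqrt(2) ~ 5.657
```

In the paper's second stage this penalty is added (via a subgradient step) to the linear-programming objective over the joint state-action probabilities, biasing the solution toward policies with the monotone switching structure.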
Keywords :
Markov processes; convex programming; gradient methods; iterative methods; linear programming; probability; ADMM; Markov decision processes; alternating direction method of multipliers; conditional probabilities; joint state-action probabilities; linear programming problem; monotone sparsity pattern; optimal policy; randomized monotone policy mixture; sparsity structure; subgradient step; sum-of-norms regularization; two-stage convex optimization algorithm; Convergence; Fading; Joints; Linear programming; Markov processes; Optimization; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control Conference (AUCC), 2013 3rd Australian
Conference_Location :
Fremantle, WA
Print_ISBN :
978-1-4799-2497-4
Type :
conf
DOI :
10.1109/AUCC.2013.6697239
Filename :
6697239