Regional Information Center for Science and Technology (RICeST) - Parameterized penalties in the dual representation of Markov decision processes

DocumentCode :

3163642

Title :

Parameterized penalties in the dual representation of Markov decision processes

Author :

Fan Ye ; Enlu Zhou

Author_Institution :

Dept. of Ind. & Enterprise Syst. Eng., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA

fYear :

2012

fDate :

10-13 Dec. 2012

Firstpage :

870

Lastpage :

876

Abstract :

Duality in Markov decision processes (MDPs) has been studied recently by several researchers with the goal of deriving dual bounds on the value function. In this paper we propose using parameterized penalty functions in the dual representation of MDPs, which allows us to integrate different types of penalty functions and guarantees a tighter dual bound as more penalties are used. To complement and diversify the existing linear penalties developed in the literature, we also introduce a new class of nonlinear penalties that can be used for a broad class of problems and are easy to implement in practice. Based on this new class of penalties, our framework of parameterized penalties is a promising method for producing tighter dual bounds than existing duality-based methods. We compare the performance of the dual bounds induced by different penalties on a numerical example, demonstrating the effectiveness of our method.

Keywords :

Markov processes; decision making; duality (mathematics); dynamic programming; MDP; Markov decision process dual-representation; dual bounds; duality-based methods; linear penalties; nonlinear penalties; parameterized penalty functions; tighter dual bound; value function; Aerospace electronics; Approximation methods; Linear programming; Optimization; Vectors

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Decision and Control (CDC), 2012 IEEE 51st Annual Conference on

Conference_Location :

Maui, HI

ISSN :

0743-1546

Print_ISBN :

978-1-4673-2065-8

Electronic_ISBN :

0743-1546

Type :

conf

DOI :

10.1109/CDC.2012.6426037

Filename :

6426037

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3163642