DocumentCode
2825307
Title
Distributed optimization of Markov reward processes
Author
Campos-Náñez, Enrique
Author_Institution
George Washington Univ., Washington, DC
fYear
2007
fDate
12-14 Dec. 2007
Firstpage
6166
Lastpage
6171
Abstract
Dynamic programming provides perhaps the most natural way to model many control problems, but suffers from the fact that existing solution procedures do not scale gracefully with the size of the problem. In this work, we present a gradient-based policy search technique that exploits the fact that in many applications the state space and control actions are naturally distributed. After presenting our modeling assumptions, we introduce a technique in which distributed agents compute estimates of the partial derivatives of a system-wide objective with respect to the parameters under their control and use them in a gradient-based policy search procedure. We illustrate the algorithm with an application to energy-efficient coverage in energy-harvesting sensor networks. The resulting algorithm can be implemented using only local information available to the sensors and is therefore fully scalable. Our numerical results are encouraging and suggest the usefulness of our approach.
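A minimal sketch of the kind of procedure the abstract describes: each agent holds one parameter of a randomized policy, forms a score-function (GPOMDP-style) estimate of the partial derivative of the system-wide reward with respect to that parameter, and takes a local gradient-ascent step. This is an illustration only, not the paper's algorithm; the toy coverage reward, the Bernoulli activation policy, and all constants below are invented for the example, and it assumes every agent can observe the scalar system-wide reward (the paper goes further for its sensor-network application, using only local information).

import numpy as np

# Illustrative sketch only: a score-function (GPOMDP-style) distributed
# policy-gradient loop, not the paper's actual algorithm. The toy reward,
# policy parameterization, and hyperparameters are invented for this example.

rng = np.random.default_rng(0)

N = 5          # distributed agents, each controlling one policy parameter
GAMMA = 0.95   # discount factor for the eligibility traces
STEP = 0.5     # gradient-ascent step size
T = 2000       # simulation steps per gradient estimate
EPOCHS = 40

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def system_reward(actions):
    # Toy system-wide objective: diminishing coverage returns from each
    # additional active sensor, minus a per-sensor energy cost.
    k = actions.sum()
    return 4.0 * (1.0 - 0.6 ** k) - 0.5 * k

theta = np.zeros(N)  # agent i holds and updates theta[i] locally

for epoch in range(EPOCHS):
    z = np.zeros(N)   # eligibility traces, one per agent
    g = np.zeros(N)   # running gradient estimates, one per agent
    avg_r = 0.0
    for t in range(T):
        p = sigmoid(theta)                     # activation probabilities
        a = (rng.random(N) < p).astype(float)  # each agent acts locally
        r = system_reward(a)                   # scalar system-wide reward
        # Score function of a Bernoulli(sigmoid(theta_i)) policy is
        # (a_i - p_i); agent i needs only its own action and the reward r.
        z = GAMMA * z + (a - p)
        g += (r * z - g) / (t + 1)             # running average of r_t * z_t
        avg_r += (r - avg_r) / (t + 1)
    theta += STEP * g                          # local gradient-ascent step
    print(f"epoch {epoch:2d}  avg reward {avg_r:6.3f}  "
          f"activation probs {np.round(sigmoid(theta), 2)}")

Running the sketch, the activation probabilities drift toward an interior operating point where the marginal coverage of one more active sensor balances its energy cost. The numpy vectorization is notational only: each update to z[i], g[i], and theta[i] depends solely on agent i's own action and the shared scalar reward, which is what makes the procedure distributable.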
Keywords
decentralised control; dynamic programming; gradient methods; state-space methods; stochastic systems; Markov reward process; distributed optimization; gradient-based policy search technique; state-space method; Control systems; Decision making; Distributed computing; Energy efficiency; Modeling; Resource management; Size control
fLanguage
English
Publisher
ieee
Conference_Title
2007 46th IEEE Conference on Decision and Control
Conference_Location
New Orleans, LA
ISSN
0191-2216
Print_ISBN
978-1-4244-1497-0
Type
conf
DOI
10.1109/CDC.2007.4434649
Filename
4434649
Link To Document