DocumentCode
493359
Title
A unified framework for temporal difference methods
Author
Bertsekas, Dimitri P.
Author_Institution
Lab. for Inf. & Decision Syst. (LIDS), Massachusetts Inst. of Technol., Cambridge, MA
fYear
2009
fDate
March 30-April 2, 2009
Firstpage
1
Lastpage
7
Abstract
We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
Keywords
approximation theory; dynamic programming; approximate dynamic programming; high-dimensional fixed point problem; monotone variational inequalities; temporal difference methods; Books; Costs; Difference equations; Dynamic programming; Jacobian matrices; Laboratories; Least squares approximation; Least squares methods; Linear matrix inequalities; Probability distribution
fLanguage
English
Publisher
IEEE
Conference_Titel
Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
Conference_Location
Nashville, TN
Print_ISBN
978-1-4244-2761-1
Type
conf
DOI
10.1109/ADPRL.2009.4927518
Filename
4927518