Title :
A unified framework for temporal difference methods
Author :
Bertsekas, Dimitri P.
Author_Institution :
Laboratory for Information and Decision Systems (LIDS), Massachusetts Institute of Technology, Cambridge, MA
Date :
March 30 - April 2, 2009
Abstract :
We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
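As a rough illustration of what a projected equation looks like, the sketch below assumes the fixed point problem is linear, x = Ax + b, and that the projection onto S is weighted by a distribution xi. Under those assumptions the projected equation Phi r = Pi(A Phi r + b) reduces to a small linear system C r = d. The function names, the 3-state chain, the costs, and the features are hypothetical choices for illustration and are not taken from the paper. The system is solved exactly here, not by simulation.

```python
import numpy as np

def solve_projected_equation(A, b, Phi, xi):
    """Solve Phi r = Pi (A Phi r + b), where Pi is the xi-weighted
    projection onto the column span of Phi (assumed full column rank)."""
    Xi = np.diag(xi)
    n = A.shape[0]
    # Orthogonality condition Phi' Xi (Phi r - A Phi r - b) = 0 gives C r = d.
    C = Phi.T @ Xi @ (np.eye(n) - A) @ Phi
    d = Phi.T @ Xi @ b
    return np.linalg.solve(C, d)

def weighted_projection(Phi, xi):
    """Matrix of the projection onto span(Phi) in the xi-weighted norm."""
    Xi = np.diag(xi)
    return Phi @ np.linalg.solve(Phi.T @ Xi @ Phi, Phi.T @ Xi)

# Hypothetical 3-state discounted Markov chain (illustration only).
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
xi = np.array([0.25, 0.50, 0.25])   # stationary distribution of P
alpha = 0.9                         # discount factor (assumed)
g = np.array([1.0, 0.0, 2.0])       # per-state costs (assumed)
A = alpha * P

# Two features per state: a constant and the state index.
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0]])

r = solve_projected_equation(A, g, Phi, xi)
Pi = weighted_projection(Phi, xi)
x_exact = np.linalg.solve(np.eye(3) - A, g)

print("approximation Phi r:", Phi @ r)
print("exact fixed point:  ", x_exact)
```

With A = alpha P and xi the stationary distribution of P, this is the equation that TD(0)-type methods aim to solve in the discounted-cost setting. When Phi is the identity, the projected solution coincides with the exact fixed point.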
Keywords :
approximation theory; dynamic programming; approximate dynamic programming; high-dimensional fixed point problem; monotone variational inequalities; temporal difference methods; Books; Costs; Difference equations; Jacobian matrices; Laboratories; Least squares approximation; Least squares methods; Linear matrix inequalities; Probability distribution
Conference_Title :
2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL '09)
Conference_Location :
Nashville, TN
Print_ISBN :
978-1-4244-2761-1
DOI :
10.1109/ADPRL.2009.4927518