• DocumentCode
    493359
  • Title
    A unified framework for temporal difference methods
  • Author
    Bertsekas, Dimitri P.
  • Author_Institution
    Lab. for Inf. & Decision Syst. (LIDS), Massachusetts Inst. of Technol., Cambridge, MA
  • fYear
    2009
  • fDate
    March 30, 2009 - April 2, 2009
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    We propose a unified framework for a broad class of methods to solve projected equations that approximate the solution of a high-dimensional fixed point problem within a subspace S spanned by a small number of basis functions or features. These methods originated in approximate dynamic programming (DP), where they are collectively known as temporal difference (TD) methods. Our framework is based on a connection with projection methods for monotone variational inequalities, which involve alternative representations of the subspace S (feature scaling). Our methods admit simulation-based implementations, and even when specialized to DP problems, include extensions/new versions of the standard TD algorithms, which offer some special implementation advantages and reduced overhead.
  • Keywords
    approximation theory; dynamic programming; approximate dynamic programming; high-dimensional fixed point problem; monotone variational inequalities; temporal difference methods; Books; Costs; Difference equations; Dynamic programming; Jacobian matrices; Laboratories; Least squares approximation; Least squares methods; Linear matrix inequalities; Probability distribution
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Adaptive Dynamic Programming and Reinforcement Learning, 2009. ADPRL '09. IEEE Symposium on
  • Conference_Location
    Nashville, TN
  • Print_ISBN
    978-1-4244-2761-1
  • Type
    conf
  • DOI
    10.1109/ADPRL.2009.4927518
  • Filename
    4927518