• DocumentCode
    2703131
  • Title
    Convergence behavior of temporal difference learning
  • Author
    Malhotra, Raj P.

  • Author_Institution
    Dept. of Electr. Eng., Dayton Univ., OH, USA
  • Volume
    2
  • fYear
    1996
  • fDate
    20-23 May 1996
  • Firstpage
    887
  • Abstract
    Temporal difference learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a variety of well-known intelligent systems, such as Samuel's checker player and Tesauro's backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
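    The incremental prediction the abstract describes can be illustrated with tabular TD(0) on a small random-walk task; this is a minimal sketch for context, not the paper's experimental setup, and all constants and the task itself are assumptions.

    ```python
    import random

    N_STATES = 5     # non-terminal states 1..5 of a random walk
    ALPHA = 0.1      # step size (assumed)
    GAMMA = 1.0      # undiscounted episodic task

    def run_td0(episodes=5000, seed=0):
        """Learn state-value predictions with TD(0) from simulated experience."""
        rng = random.Random(seed)
        # Indices 0 and N_STATES + 1 are terminal; interior values start at 0.5.
        V = [0.0] + [0.5] * N_STATES + [0.0]
        for _ in range(episodes):
            s = (N_STATES + 1) // 2              # start in the middle state
            while 0 < s <= N_STATES:
                s2 = s + rng.choice((-1, 1))     # unbiased random step
                r = 1.0 if s2 == N_STATES + 1 else 0.0
                # Incremental TD(0) update: nudge V[s] toward the
                # one-step bootstrapped target r + gamma * V[s'].
                V[s] += ALPHA * (r + GAMMA * V[s2] - V[s])
                s = s2
        return V[1:N_STATES + 1]

    values = run_td0()
    ```

    For this walk the true values are 1/6, 2/6, ..., 5/6, so the learned predictions should increase from left to right and approach those fractions as experience accumulates.
    
    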
  • Keywords
    convergence; learning (artificial intelligence); Backgammon program; Samuel; Temporal Difference Learning; Tesauro; checker-player; convergence performance; incremental learning; intelligent systems; sequential processes; Control systems; Convergence; Delay; Feedback; Intelligent systems; Iterative algorithms; Iterative methods; Learning; Pattern recognition; Yarn;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Proceedings of the IEEE 1996 National Aerospace and Electronics Conference (NAECON 1996)
  • Conference_Location
    Dayton, OH
  • ISSN
    0547-3578
  • Print_ISBN
    0-7803-3306-3
  • Type
    conf
  • DOI
    10.1109/NAECON.1996.517756
  • Filename
    517756