DocumentCode
2703131
Title
Convergence behavior of temporal difference learning
Author
Malhotra, Raj P.
Author_Institution
Dept. of Electr. Eng., Dayton Univ., OH, USA
Volume
2
fYear
1996
fDate
20-23 May 1996
Firstpage
887
Abstract
Temporal Difference Learning is an important class of incremental learning procedures which learn to predict outcomes of sequential processes through experience. Although these algorithms have been used in a variety of notable intelligent systems, such as Samuel's checker-player and Tesauro's Backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
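For readers unfamiliar with the method the abstract describes, the core idea of temporal difference prediction can be illustrated with a minimal TD(0) sketch. This is not the paper's code; the random-walk task, step size, and episode count below are illustrative assumptions. Each non-terminal transition updates the value estimate toward the estimated value of the next state rather than waiting for the final outcome, which is what makes the procedure incremental.

```python
# Minimal TD(0) value-prediction sketch (illustrative; not from the paper).
# Task: a 5-state random walk, states 0..4, starting at state 2.
# Stepping off the left end yields reward 0; off the right end, reward 1.
import random

def td0_random_walk(episodes=5000, alpha=0.1, seed=0):
    rng = random.Random(seed)
    V = [0.0] * 5  # value estimates for the non-terminal states
    for _ in range(episodes):
        s = 2  # start in the middle
        while True:
            s_next = s + (1 if rng.random() < 0.5 else -1)
            if s_next < 0:        # left terminal: reward 0, next value 0
                V[s] += alpha * (0.0 - V[s])
                break
            if s_next > 4:        # right terminal: reward 1, next value 0
                V[s] += alpha * (1.0 - V[s])
                break
            # TD(0) update: V(s) <- V(s) + alpha * (r + V(s') - V(s)), with r = 0
            V[s] += alpha * (V[s_next] - V[s])
            s = s_next
    return V

values = td0_random_walk()
# With enough episodes the estimates approach the true values 1/6, 2/6, ..., 5/6.
```

With a constant step size the estimates fluctuate around the true values rather than converging exactly; this sensitivity of convergence behavior to the learning parameters is the kind of question the paper's experiments examine.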
Keywords
convergence; learning (artificial intelligence); Backgammon program; Samuel; Temporal Difference Learning; Tesauro; checker-player; convergence performance; incremental learning; intelligent systems; sequential processes; Control systems; Convergence; Delay; Feedback; Intelligent systems; Iterative algorithms; Iterative methods; Learning; Pattern recognition
fLanguage
English
Publisher
ieee
Conference_Titel
Proceedings of the IEEE 1996 National Aerospace and Electronics Conference (NAECON 1996)
Conference_Location
Dayton, OH
ISSN
0547-3578
Print_ISBN
0-7803-3306-3
Type
conf
DOI
10.1109/NAECON.1996.517756
Filename
517756
Link To Document