Title :
Convergence behavior of temporal difference learning
Author :
Malhotra, Raj P.
Author_Institution :
Dept. of Electr. Eng., Dayton Univ., OH, USA
Abstract :
Temporal Difference Learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a variety of well-known intelligent systems, such as Samuel's checker player and Tesauro's Backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents their observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
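To illustrate the kind of incremental prediction procedure the abstract refers to, below is a minimal sketch of tabular TD(0) value prediction on a simple random-walk chain. The environment, function name, and parameter values are illustrative assumptions, not taken from the paper.

    import random

    def td_zero(episodes, alpha=0.1, gamma=1.0, n_states=5):
        """Tabular TD(0) prediction on an illustrative random-walk chain.

        States 0..n_states-1; the walk terminates past either end of the
        chain, with reward 1 on the right and 0 on the left. All names and
        the environment itself are assumptions for illustration only.
        """
        values = [0.5] * n_states  # initial value estimates
        for _ in range(episodes):
            state = n_states // 2  # start in the middle of the chain
            while True:
                next_state = state + random.choice((-1, 1))
                if next_state < 0:            # left terminal
                    reward, done = 0.0, True
                elif next_state >= n_states:  # right terminal
                    reward, done = 1.0, True
                else:
                    reward, done = 0.0, False
                # bootstrapped TD target and incremental update
                target = reward + (0.0 if done else gamma * values[next_state])
                values[state] += alpha * (target - values[state])
                if done:
                    break
                state = next_state
        return values

    if __name__ == "__main__":
        print(td_zero(episodes=1000))

With enough episodes the estimates approach the true state values of the chain; the paper's experiments concern how reliably such iterative updates converge in practice.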
Keywords :
convergence; learning (artificial intelligence); Backgammon program; Samuel; Temporal Difference Learning; Tesauro; checker-player; convergence performance; incremental learning; intelligent systems; sequential processes; Control systems; Convergence; Delay; Feedback; Intelligent systems; Iterative algorithms; Iterative methods; Learning; Pattern recognition;
Conference_Titel :
Proceedings of the IEEE 1996 National Aerospace and Electronics Conference (NAECON 1996)
Conference_Location :
Dayton, OH
Print_ISBN :
0-7803-3306-3
DOI :
10.1109/NAECON.1996.517756