DocumentCode :
2703131
Title :
Convergence behavior of temporal difference learning
Author :
Malhotra, Raj P.
Author_Institution :
Dept. of Electr. Eng., Dayton Univ., OH, USA
Volume :
2
fYear :
1996
fDate :
20-23 May 1996
Firstpage :
887
Abstract :
Temporal Difference Learning is an important class of incremental learning procedures that learn to predict the outcomes of sequential processes through experience. Although these algorithms have been used in a variety of notable intelligent systems, such as Samuel's checker player and Tesauro's backgammon program, their convergence properties remain poorly understood. This paper provides a brief summary of the theoretical basis for these algorithms and documents their observed convergence performance in a variety of experiments. The implications of these results are also briefly discussed.
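The prediction setting the abstract describes can be illustrated with a minimal TD(0) update on the classic five-state random walk; this is a sketch of the general technique, not code from the paper, and the function name, step size, and episode count are illustrative assumptions.

```python
import random

def td0_random_walk(episodes=5000, alpha=0.1, seed=0):
    """TD(0) value prediction on a five-state random walk.

    States 1..5 sit between two terminal states (0 and 6); the walk
    starts in the middle, moves left or right uniformly at random, and
    yields reward 1 only on reaching the right terminal. The true state
    values are 1/6, 2/6, ..., 5/6.
    """
    rng = random.Random(seed)
    V = [0.0] * 7  # V[0] and V[6] are terminal states with value 0
    for _ in range(episodes):
        s = 3  # start in the middle state
        while 0 < s < 6:
            s_next = s + rng.choice((-1, 1))
            r = 1.0 if s_next == 6 else 0.0
            # TD(0) update: V(s) <- V(s) + alpha * (r + V(s') - V(s))
            V[s] += alpha * (r + V[s_next] - V[s])
            s = s_next
    return V[1:6]  # estimated values of the non-terminal states

print(td0_random_walk())
```

With a small constant step size the estimates fluctuate around the true values rather than converging exactly, which is one facet of the convergence behavior the paper examines.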
Keywords :
convergence; learning (artificial intelligence); Backgammon program; Samuel; Temporal Difference Learning; Tesauro; checker-player; convergence performance; incremental learning; intelligent systems; sequential processes; Control systems; Convergence; Delay; Feedback; Intelligent systems; Iterative algorithms; Iterative methods; Learning; Pattern recognition; Yarn;
fLanguage :
English
Publisher :
ieee
Conference_Title :
Proceedings of the IEEE 1996 National Aerospace and Electronics Conference (NAECON 1996)
Conference_Location :
Dayton, OH
ISSN :
0547-3578
Print_ISBN :
0-7803-3306-3
Type :
conf
DOI :
10.1109/NAECON.1996.517756
Filename :
517756