Performance analysis of a new updating rule for TD(λ) learning in feedforward networks for position evaluation in Go game

Author

Chan, Horace Wai-kit ; King, Irwin ; Lui, John C S

Author_Institution

Dept. of Comput. Sci. & Eng., Chinese Univ. of Hong Kong, Shatin, Hong Kong

fYear

1996

fDate

3-6 Jun 1996

Firstpage

1716

Abstract

In this paper, a new updating rule for applying temporal difference (TD) learning to multilayer feedforward networks is derived. Networks are trained to evaluate Go board positions by TD(λ) learning with different values of λ. The performance of each network is estimated by having it play against the other networks. Results show that a nonzero λ improves the network's learning and that, statistically, larger values of λ give better performance.
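
For orientation, here is a minimal sketch of the standard TD(λ) update with eligibility traces, applied to a linear position evaluator. It is not the paper's new updating rule for multilayer networks, which this record does not reproduce. The function name `td_lambda_episode`, the feature-vector encoding of positions, and the choice of a single outcome signal at the end of the game are all assumptions made for illustration.

```python
from typing import List, Sequence


def td_lambda_episode(
    weights: Sequence[float],
    features: Sequence[Sequence[float]],
    final_outcome: float,
    alpha: float = 0.1,
    lam: float = 0.7,
) -> List[float]:
    """Run standard online TD(lambda) over one game for a linear evaluator.

    Assumptions (hypothetical, not from the paper): V(x) = w . x, no
    discounting, and the only training signal is the outcome observed
    after the last position.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace, one entry per weight

    for t, x in enumerate(features):
        v_t = sum(wi * xi for wi, xi in zip(w, x))
        if t + 1 < len(features):
            # Target is the evaluation of the next position.
            target = sum(wi * xi for wi, xi in zip(w, features[t + 1]))
        else:
            # After the last position the target is the game outcome.
            target = final_outcome

        # For a linear evaluator the gradient of V(x) w.r.t. w is x itself.
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        delta = target - v_t  # temporal-difference error
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]

    return w


if __name__ == "__main__":
    game = [[1.0, 0.0], [0.0, 1.0]]  # two positions, one-hot features
    print(td_lambda_episode([0.0, 0.0], game, 1.0, alpha=0.5, lam=0.5))
    print(td_lambda_episode([0.0, 0.0], game, 1.0, alpha=0.5, lam=0.0))
```

With λ = 0 only the weight tied to the last position moves, while a nonzero λ also passes part of the final error back to the earlier position through the trace. This is the mechanism whose effect the abstract compares across values of λ.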

Keywords

feedforward neural nets; games of skill; learning (artificial intelligence); multilayer perceptrons; temporal reasoning; Go game; TD(λ) learning; multilayer feedforward networks; neural nets; performance analysis; position evaluation; temporal difference learning; updating rule; Backpropagation; Computer science; Delay; Design engineering; Humans; Intelligent networks; Knowledge engineering; Neural networks; Performance analysis; Supervised learning;

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks, 1996., IEEE International Conference on

Conference_Location

Washington, DC

Print_ISBN

0-7803-3210-5

Type

conf

DOI

10.1109/ICNN.1996.549159

Filename

549159

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=303422