Why co-evolution beats temporal difference learning at Backgammon for a linear architecture, but not a non-linear architecture

Author

Darwen, Paul J.

Author_Institution

Dept. of Comput. Sci. & Electr. Eng., Queensland Univ., Brisbane, Qld., Australia

Volume

2

fYear

2001

fDate

2001

Firstpage

1003

Abstract

No Free Lunch theorems show that the algorithm must suit the problem. This does not answer the novice's question: for a given problem, which algorithm to use? This paper compares co-evolutionary learning and temporal difference learning on the game of Backgammon, which (like many real-world tasks) has an element of random uncertainty. Unfortunately, to fully evaluate a single strategy using undirected sampling of board positions, using only random dice rolls, requires a great deal of computation. Evolution's all-or-nothing replacement of entire solutions needs accurate evaluation, but relatively rare board positions are needed to train above a certain level. Temporal difference learning, with its incremental changes, does not use such an all-or-nothing approach. These results have relevance to a variety of real-world tasks with uncertainty, such as schedule optimization.

Keywords

computer games; evolutionary computation; games of skill; learning (artificial intelligence); Backgammon; Free Lunch theorems; all-or-nothing approach; co-evolutionary learning; game; linear architecture; nonlinear architecture; random uncertainty; schedule optimization; temporal difference learning; Cognitive science; Computer architecture; Computer science; Law; Legal factors; Neural networks; Optimal scheduling; Sampling methods; Scheduling algorithm; Uncertainty

fLanguage

English

Publisher

IEEE

Conference_Titel

Evolutionary Computation, 2001. Proceedings of the 2001 Congress on

Conference_Location

Seoul

Print_ISBN

0-7803-6657-3

Type

conf

DOI

10.1109/CEC.2001.934300

Filename

934300