Title :
Temporal difference learning with eligibility traces for the game Connect Four
Author :
Thill, Markus ; Bagheri, Saeed ; Koch, Peter ; Konen, Wolfgang
Author_Institution :
Dept. of Comput. Sci., Cologne Univ. of Appl. Sci., Gummersbach, Germany
Abstract :
Systems that learn to play board games are often trained by self-play on the basis of temporal difference (TD) learning. Successful examples include Tesauro's well-known TD-Gammon and Lucas' Othello agent. For other board games of moderate complexity like Connect Four, we found in previous work that a successful system requires a very rich initial feature set with more than half a million weights and several million training games. In this work we study the benefits of adding eligibility traces to this system. To the best of our knowledge, eligibility traces have not been used before for such a large system. Different versions of eligibility traces (standard, resetting, and replacing traces) are compared. We show that eligibility traces speed up learning by a factor of two and that they increase the asymptotic playing strength.
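The abstract names three trace variants without defining them. The following is a minimal sketch of how such variants are commonly formulated for a linear value function over binary features; it is not the authors' implementation. The function names `update_trace` and `td_lambda_step`, the linear model, and all parameter names are hypothetical. In particular, "resetting" is assumed here to mean clearing the trace after an exploratory (random) move, which the abstract does not spell out.

```python
import numpy as np


def update_trace(trace, features, gamma, lam, mode="standard", explored=False):
    """Return the new eligibility trace for V(s) = w . features.

    For a linear value function, `features` is also the gradient of V
    with respect to the weights w.
    """
    if mode == "resetting" and explored:
        # Assumption: an exploratory move breaks the credit chain,
        # so the old trace is discarded and only the current gradient is kept.
        return features.astype(float)
    decayed = gamma * lam * trace
    if mode == "replacing":
        # Active features have their trace set to the feature value
        # instead of accumulating; inactive features just decay.
        return np.where(features != 0, features, decayed).astype(float)
    # Standard (accumulating) trace: decay, then add the current gradient.
    return decayed + features


def td_lambda_step(w, trace, features, reward, next_value,
                   alpha, gamma, lam, mode="standard", explored=False):
    """Apply one TD(lambda) weight update and return (new_w, new_trace)."""
    trace = update_trace(trace, features, gamma, lam, mode, explored)
    delta = reward + gamma * next_value - w @ features  # TD error
    return w + alpha * delta * trace, trace
```

With traces, a single TD error adjusts the weights of all recently active features at once, weighted by how recently they were active, which is one plausible reading of why learning might need fewer training games.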
Keywords :
computer games; learning (artificial intelligence); multi-agent systems; Connect Four game; Lucas Othello agent; TD learning; TD-Gammon agent; asymptotic playing strength; board games; replacing eligibility trace; resetting eligibility trace; standard eligibility trace; temporal difference learning; Coherence
Conference_Title :
Computational Intelligence and Games (CIG), 2014 IEEE Conference on
Conference_Location :
Dortmund
DOI :
10.1109/CIG.2014.6932870