• DocumentCode
    130207
  • Title
    Temporal difference learning with eligibility traces for the game Connect Four
  • Author
    Thill, Markus ; Bagheri, Saeed ; Koch, Peter ; Konen, Wolfgang
  • Author_Institution
    Dept. of Comput. Sci., Cologne Univ. of Appl. Sci., Gummersbach, Germany
  • fYear
    2014
  • fDate
    26-29 Aug. 2014
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Systems that learn to play board games are often trained by self-play on the basis of temporal difference (TD) learning. Successful examples include Tesauro's well-known TD-Gammon and Lucas' Othello agent. For other board games of moderate complexity, such as Connect Four, we found in previous work that a successful system requires a very rich initial feature set with more than half a million weights and several million training games. In this work we study the benefits of adding eligibility traces to this system. To the best of our knowledge, eligibility traces have not been used before for such a large system. Different versions of eligibility traces (standard, resetting, and replacing traces) are compared. We show that eligibility traces speed up learning by a factor of two and that they increase the asymptotic playing strength.
  • Keywords
    computer games; learning (artificial intelligence); multi-agent systems; Connect Four game; Lucas Othello agent; TD learning; TD-Gammon agent; asymptotic playing strength; board games; replacing eligibility trace; resetting eligibility trace; standard eligibility trace; temporal difference learning; Coherence
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Games (CIG), 2014 IEEE Conference on
  • Conference_Location
    Dortmund
  • Type
    conf
  • DOI
    10.1109/CIG.2014.6932870
  • Filename
    6932870
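
The abstract compares standard, resetting, and replacing eligibility traces. The sketch below illustrates how such variants can differ in a single TD(λ) update. It is a minimal illustration and not the authors' implementation: it assumes a linear value function over a small feature vector, and it assumes that "resetting" means clearing the trace vector to zero (for example after an exploratory move), which the record itself does not define. The names `update_traces` and `td_lambda_step` and all parameters are hypothetical.

```python
from typing import List, Tuple


def update_traces(traces: List[float], features: List[float],
                  gamma: float, lam: float,
                  replacing: bool = False, reset: bool = False) -> List[float]:
    """Return the new eligibility traces for a linear value function.

    Assumption: for a linear value function the gradient with respect to
    the weights is the feature vector itself.
    """
    if reset:
        # Assumed meaning of "resetting": forget all earlier credit.
        decayed = [0.0] * len(traces)
    else:
        # Usual decay of every trace by gamma * lambda.
        decayed = [gamma * lam * e for e in traces]
    if replacing:
        # Replacing traces: an active feature overwrites its trace.
        return [x if x != 0.0 else e for e, x in zip(decayed, features)]
    # Standard (accumulating) traces: add the gradient to the decayed trace.
    return [e + x for e, x in zip(decayed, features)]


def td_lambda_step(weights: List[float], traces: List[float],
                   features: List[float], reward: float, next_value: float,
                   alpha: float, gamma: float, lam: float,
                   replacing: bool = False,
                   reset: bool = False) -> Tuple[List[float], List[float]]:
    """Apply one TD(lambda) update and return (new_weights, new_traces)."""
    value = sum(w * x for w, x in zip(weights, features))
    delta = reward + gamma * next_value - value  # TD error
    new_traces = update_traces(traces, features, gamma, lam, replacing, reset)
    new_weights = [w + alpha * delta * e for w, e in zip(weights, new_traces)]
    return new_weights, new_traces
```

With `traces = [1.0, 0.0]`, `features = [1.0, 1.0]`, `gamma = 1.0`, and `lam = 0.5`, the standard variant yields `[1.5, 1.0]`, while both the replacing variant and a reset yield `[1.0, 1.0]`: the first accumulates credit on a repeatedly active feature, the others cap or discard it.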