• DocumentCode
    3497910
  • Title

    Monte-Carlo Go Reinforcement Learning Experiments

  • Author

    Bouzy, Bruno ; Chaslot, Guillaume

  • Author_Institution
    UFR de mathématiques et d'informatique, Univ. René Descartes, Paris
  • fYear
    2006
  • fDate
    22-24 May 2006
  • Firstpage
    187
  • Lastpage
    194
  • Abstract
    This paper describes experiments using reinforcement learning techniques to compute pattern urgencies used during simulations performed in a Monte-Carlo Go architecture. Currently, Monte-Carlo is a popular technique for computer Go. In a previous study, Monte-Carlo was associated with domain-dependent knowledge in the Go-playing program Indigo. In 2003, a 3×3 pattern database was built manually. This paper explores the possibility of using reinforcement learning to automatically tune the 3×3 pattern urgencies. On 9×9 boards, within the Monte-Carlo architecture of Indigo, the result obtained by our automatic learning experiments is better than the manual method by a 3-point margin on average, which is satisfactory. Although the current results are promising on 19×19 boards, obtaining strictly positive results with such a large size remains to be done.
  • Keywords
    computer games; learning (artificial intelligence); Go-playing program; Indigo; Monte-Carlo Go architecture; pattern database; pattern urgencies computing; reinforcement learning; Computational modeling; Computer architecture; Computer science; Databases; Distributed computing; Humans; Learning; Performance evaluation; Vocabulary; Computer Go; Monte-Carlo; Reinforcement Learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence and Games, 2006 IEEE Symposium on
  • Conference_Location
    Reno, NV
  • Print_ISBN
    1-4244-0464-9
  • Type

    conf

  • DOI
    10.1109/CIG.2006.311699
  • Filename
    4100126