• DocumentCode
    1748878
  • Title
    Competitive reinforcement learning for combinatorial problems
  • Author
    Abramson, Myriam; Wechsler, Harry
  • Author_Institution
    George Mason Univ., Fairfax, VA, USA
  • Volume
    4
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    2333
  • Abstract
    This paper shows that the competitive learning rule found in learning vector quantization (LVQ) serves as a promising function approximator that enables reinforcement learning methods to cope with a large decision search space, defined in terms of different classes of input patterns, like those found in the game of Go. This paper describes S[arsa]LVQ, a novel reinforcement learning algorithm, and shows its feasibility for Go. As the distributed LVQ representation corresponds to a (quantized) codebook of compressed and generalized pattern templates, the state space requirements for online reinforcement methods are significantly reduced, thus decreasing the complexity of the decision space and consequently improving the play performance. As a result of competitive learning, SLVQ can win against heuristic players and starts to level off against stronger opponents such as Wally. SLVQ outperforms S[arsa]Linear when playing against both a heuristic player and Wally. Furthermore, while playing Go, SLVQ learns to stay alive while SLinear fails to do so.
  • Keywords
    function approximation; games of skill; search problems; self-organising feature maps; unsupervised learning; vector quantisation; Go game; codebook; competitive learning; data compression; learning vector quantization; reinforcement learning; search space; self organisation; Delay; Joining processes; Machine learning; Shape; State-space methods; Vector quantization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2001. Proceedings. IJCNN '01. International Joint Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    1098-7576
  • Print_ISBN
    0-7803-7044-9
  • Type
    conf
  • DOI
    10.1109/IJCNN.2001.938727
  • Filename
    938727