  • DocumentCode
    2226464
  • Title
    TP-XCS: An XCS classifier system with fixed-length memory for reinforcement learning
  • Author
    Pickering, Tom; Kovacs, Tim
  • Author_Institution
    Department of Computer Science, University of Bristol, Bristol, U.K.
  • fYear
    2015
  • fDate
    25-28 May 2015
  • Firstpage
    3020
  • Lastpage
    3025
  • Abstract
    We introduce a rule-based reinforcement learning system named Temporally Perceptive XCS (TP-XCS) that incorporates memory into the well-known XCS Learning Classifier System, to disambiguate perceptually aliased states in Partially Observable Markov Decision Processes (POMDPs), and hence to greatly outperform the basic (memoryless) XCS in such problems. TP-XCS augments the input to XCS with a fixed-length window of XCS's sensory perceptions from previous time steps. The length of the window is a parameter set in advance and fixed during the run. This is a very simple approach to adding memory and it has the disadvantage that the size of the state/action space grows dramatically as the window is made longer, that is, it exacerbates the “curse of dimensionality” all RL systems face. However, XCS is able to generalize effectively over irrelevant inputs by using a genetic algorithm to find useful state aggregations, and our results show that TP-XCS inherits this ability and is able to generalize effectively over irrelevant memories in two small POMDPs called Woods100 and Woods101.
  • Keywords
    Face; Genetic algorithms; Learning (artificial intelligence); Markov processes; Memory management; Sociology; Statistics;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Evolutionary Computation (CEC), 2015 IEEE Congress on
  • Conference_Location
    Sendai, Japan
  • Type
    conf
  • DOI
    10.1109/CEC.2015.7257265
  • Filename
    7257265
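
The input augmentation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `PerceptionWindow`, the function `input_space_size`, the binary string encoding of perceptions, and the zero padding used before any history exists are all assumptions made for illustration. The sketch only shows how a fixed-length window of past perceptions might be prepended to the current one, and how the number of distinct augmented inputs grows with the window length.

```python
from collections import deque


class PerceptionWindow:
    """Fixed-length memory of past perceptions (hypothetical helper, not from the paper)."""

    def __init__(self, window_length, perception_bits, pad_symbol="0"):
        # window_length is set in advance and stays fixed, as the abstract describes.
        self.window_length = window_length
        # Assumption: slots with no history yet are filled with a padding string.
        self.pad = pad_symbol * perception_bits
        self.reset()

    def reset(self):
        """Clear the memory, e.g. at the start of a new episode."""
        self.history = deque([self.pad] * self.window_length,
                             maxlen=self.window_length)

    def augment(self, perception):
        """Return past perceptions (oldest first) followed by the current one."""
        augmented = "".join(self.history) + perception
        # The deque drops the oldest entry automatically once it is full.
        self.history.append(perception)
        return augmented


def input_space_size(perception_bits, window_length):
    """Number of distinct binary augmented inputs for a given window length."""
    return 2 ** (perception_bits * (window_length + 1))


if __name__ == "__main__":
    window = PerceptionWindow(window_length=2, perception_bits=2)
    for p in ["01", "10", "11"]:
        print(p, "->", window.augment(p))
    for n in range(4):
        print("window", n, "inputs", input_space_size(2, n))
```

Under these assumptions, each extra remembered time step multiplies the number of possible inputs by 2 to the power of the perception length, which is the growth the abstract refers to as exacerbating the curse of dimensionality.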