Title :
TP-XCS: An XCS classifier system with fixed-length memory for reinforcement learning
Author :
Pickering, Tom ; Kovacs, Tim
Author_Institution :
Department of Computer Science, University of Bristol, Bristol, U.K.
Abstract :
We introduce Temporally Perceptive XCS (TP-XCS), a rule-based reinforcement learning system that incorporates memory into the well-known XCS Learning Classifier System. The memory disambiguates perceptually aliased states in Partially Observable Markov Decision Processes (POMDPs), allowing TP-XCS to greatly outperform the basic (memoryless) XCS on such problems. TP-XCS augments the input to XCS with a fixed-length window of XCS's sensory perceptions from previous time steps. The length of the window is a parameter set in advance and fixed during the run. This is a very simple approach to adding memory, and it has the disadvantage that the size of the state/action space grows dramatically as the window is made longer; that is, it exacerbates the “curse of dimensionality” that all RL systems face. However, XCS is able to generalize effectively over irrelevant inputs by using a genetic algorithm to find useful state aggregations, and our results show that TP-XCS inherits this ability and generalizes effectively over irrelevant memories in two small POMDPs called Woods100 and Woods101.
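The fixed-length window described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `WindowedPerception`, the bit-string percept encoding, the zero padding before any history exists, and the most-recent-first ordering are all assumptions made for illustration.

```python
from collections import deque


class WindowedPerception:
    """Hypothetical helper: append the last `window` percepts to the current one."""

    def __init__(self, window, percept_len, pad="0"):
        self.window = window                  # fixed for the whole run
        self.blank = pad * percept_len        # assumed filler for missing history
        self.history = deque([self.blank] * window, maxlen=window)

    def reset(self):
        """Clear the memory, e.g. at the start of a new episode."""
        self.history = deque([self.blank] * self.window, maxlen=self.window)

    def augment(self, percept):
        """Return the current percept followed by past percepts, newest first."""
        augmented = percept + "".join(reversed(self.history))
        self.history.append(percept)          # oldest percept drops out automatically
        return augmented


wp = WindowedPerception(window=2, percept_len=3)
first = wp.augment("101")    # no history yet, so the memory part is padding
second = wp.augment("110")   # memory now holds the previous percept
third = wp.augment("011")    # memory holds the two previous percepts
```

With binary percepts of length n and a window of w past steps, the augmented input has n(w + 1) bits, so the number of distinct inputs grows as 2^(n(w + 1)). This is the growth in the state/action space that the abstract refers to.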
Keywords :
Face; Genetic algorithms; Learning (artificial intelligence); Markov processes; Memory management; Sociology; Statistics
Conference_Title :
Evolutionary Computation (CEC), 2015 IEEE Congress on
Conference_Location :
Sendai, Japan
DOI :
10.1109/CEC.2015.7257265