DocumentCode
2226464
Title
TP-XCS: An XCS classifier system with fixed-length memory for reinforcement learning
Author
Pickering, Tom ; Kovacs, Tim
Author_Institution
Department of Computer Science, University of Bristol, Bristol, U.K.
fYear
2015
fDate
25-28 May 2015
Firstpage
3020
Lastpage
3025
Abstract
We introduce a rule-based reinforcement learning system named Temporally Perceptive XCS (TP-XCS) that incorporates memory into the well-known XCS Learning Classifier System, to disambiguate perceptually aliased states in Partially Observable Markov Decision Processes (POMDPs), and hence to greatly outperform the basic (memoryless) XCS in such problems. TP-XCS augments the input to XCS with a fixed-length window of XCS's sensory perceptions from previous time steps. The length of the window is a parameter set in advance and fixed during the run. This is a very simple approach to adding memory and it has the disadvantage that the size of the state/action space grows dramatically as the window is made longer, that is, it exacerbates the “curse of dimensionality” all RL systems face. However, XCS is able to generalize effectively over irrelevant inputs by using a genetic algorithm to find useful state aggregations, and our results show that TP-XCS inherits this ability and is able to generalize effectively over irrelevant memories in two small POMDPs called Woods100 and Woods101.
Keywords
Face; Genetic algorithms; Learning (artificial intelligence); Markov processes; Memory management; Sociology; Statistics;
fLanguage
English
Publisher
IEEE
Conference_Titel
Evolutionary Computation (CEC), 2015 IEEE Congress on
Conference_Location
Sendai, Japan
Type
conf
DOI
10.1109/CEC.2015.7257265
Filename
7257265