Regional Information Center for Science and Technology (RICeST) - TP-XCS: An XCS classifier system with fixed-length memory for reinforcement learning

DocumentCode :

2226464

Title :

TP-XCS: An XCS classifier system with fixed-length memory for reinforcement learning

Author :

Pickering, Tom ; Kovacs, Tim

Author_Institution :

Department of Computer Science, University of Bristol, Bristol, U.K.

fYear :

2015

fDate :

25-28 May 2015

Firstpage :

3020

Lastpage :

3025

Abstract :

We introduce a rule-based reinforcement learning system named Temporally Perceptive XCS (TP-XCS) that incorporates memory into the well-known XCS Learning Classifier System, to disambiguate perceptually aliased states in Partially Observable Markov Decision Processes (POMDPs), and hence to greatly outperform the basic (memoryless) XCS in such problems. TP-XCS augments the input to XCS with a fixed-length window of XCS's sensory perceptions from previous time steps. The length of the window is a parameter set in advance and fixed during the run. This is a very simple approach to adding memory and it has the disadvantage that the size of the state/action space grows dramatically as the window is made longer, that is, it exacerbates the “curse of dimensionality” all RL systems face. However, XCS is able to generalize effectively over irrelevant inputs by using a genetic algorithm to find useful state aggregations, and our results show that TP-XCS inherits this ability and is able to generalize effectively over irrelevant memories in two small POMDPs called Woods100 and Woods101.
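
The fixed-length window described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The class name `WindowedPerception`, the binary-string encoding of perceptions, the zero padding used before any history exists, and the oldest-first ordering are all assumptions made for illustration. The sketch only shows how the current perception might be concatenated with the previous `window_length` perceptions, and how quickly the input space grows as the window lengthens.

```python
from collections import deque


class WindowedPerception:
    """Hypothetical helper: prepends a fixed-length window of past
    perceptions to the current one (names and encoding are assumptions)."""

    def __init__(self, window_length, obs_length, pad_symbol="0"):
        self.window_length = window_length
        self.obs_length = obs_length
        # Assumed convention: missing history is filled with padding.
        self._pad = pad_symbol * obs_length
        self.reset()

    def reset(self):
        """Clear the memory at the start of an episode."""
        self._history = deque(
            [self._pad] * self.window_length, maxlen=self.window_length
        )

    def augment(self, observation):
        """Return past perceptions (oldest first) followed by the current one."""
        if len(observation) != self.obs_length:
            raise ValueError("unexpected observation length")
        augmented = "".join(self._history) + observation
        # The deque drops the oldest entry once it holds window_length items.
        self._history.append(observation)
        return augmented

    def input_space_size(self):
        """Number of distinct binary inputs: 2 ** ((window + 1) * obs_length)."""
        return 2 ** ((self.window_length + 1) * self.obs_length)
```

With a window of 2 and 2-bit perceptions, each augmented input is 6 bits long, so the input space has 64 points instead of 4. This exponential growth in the window length is the cost the abstract refers to, and it is the reason generalization over irrelevant memories matters.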

Keywords :

Face; Genetic algorithms; Learning (artificial intelligence); Markov processes; Memory management; Sociology; Statistics;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Evolutionary Computation (CEC), 2015 IEEE Congress on

Conference_Location :

Sendai, Japan

Type :

conf

DOI :

10.1109/CEC.2015.7257265

Filename :

7257265

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2226464