Regional Information Center for Science and Technology - A Clustering-Based Graph Laplacian Framework for Value Function Approximation in Reinforcement Learning

DocumentCode :

1758883

Title :

A Clustering-Based Graph Laplacian Framework for Value Function Approximation in Reinforcement Learning

Author :

Xin Xu ; Zhenhua Huang ; Graves, David ; Pedrycz, Witold

Author_Institution :

Coll. of Mechatron. & Autom., Nat. Univ. of Defense Technol., Changsha, China

Volume :

Issue :

fYear :

2014

fDate :

Dec. 2014

Firstpage :

2613

Lastpage :

2625

Abstract :

Feature representation and function approximation have been major research topics in reinforcement learning (RL) for sequential decision problems with large or continuous state spaces. In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. Using clustering-based techniques, namely K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can then be generated automatically from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximate policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions, and that the learning control performance can be improved for a variety of parameter settings.
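The pipeline the abstract outlines (subsample states by clustering, build a graph Laplacian on the cluster centers, take its low-order eigenvectors as basis functions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniformly sampled 2-D state space, the k-nearest-neighbor graph with Gaussian weights, the symmetric normalized Laplacian, the parameter values, and the function names `kmeans`, `laplacian_basis`, and `features`. The RPI policy iteration step is not shown.

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns the (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        # Assign every sampled state to its nearest center.
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (unchanged if empty).
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers

def laplacian_basis(centers, n_neighbors=5, n_basis=8, sigma=0.5):
    """Build a kNN graph on the centers and return the n_basis smoothest
    eigenpairs of its normalized Laplacian (assumed graph construction)."""
    k = len(centers)
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    weights = np.zeros((k, k))
    for i in range(k):
        nearest = np.argsort(dists[i])[1:n_neighbors + 1]  # index 0 is the node itself
        weights[i, nearest] = np.exp(-dists[i, nearest] ** 2 / (2 * sigma ** 2))
    weights = np.maximum(weights, weights.T)  # make the graph undirected
    degree = weights.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    laplacian = np.eye(k) - d_inv_sqrt[:, None] * weights * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return eigvals[:n_basis], eigvecs[:, :n_basis]

def features(state, centers, basis):
    """Feature vector of an arbitrary state: the basis row of its nearest
    center (a crude stand-in for a proper out-of-sample extension)."""
    nearest = np.linalg.norm(centers - state, axis=1).argmin()
    return basis[nearest]

# Hypothetical continuous state space: 2000 states sampled from [-1, 1]^2.
rng = np.random.default_rng(1)
states = rng.uniform(-1.0, 1.0, size=(2000, 2))
centers = kmeans(states, k=50)
eigvals, basis = laplacian_basis(centers)
phi = features(np.array([0.2, -0.4]), centers, basis)
```

In this sketch the Laplacian is built on 50 cluster centers rather than on all 2000 sampled states, so the eigendecomposition runs on a 50 x 50 matrix; a linear value function approximation would then be a weighted sum of the entries of `phi`.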

Keywords :

Markov processes; approximation theory; fuzzy set theory; graph theory; learning (artificial intelligence); pattern clustering; K-means clustering; Markov decision process; approximation policy iteration algorithm; clustering-based technique; continuous state space; feature representation; fuzzy C-means clustering; graph Laplacian framework; reinforcement learning; sequential decision problem; spectral analysis; value function approximation; Aerospace electronics; Approximation algorithms; Clustering algorithms; Economic indicators; Function approximation; Laplace equations; Approximate dynamic programming; Markov decision processes; clustering; learning control

fLanguage :

English

Journal_Title :

Cybernetics, IEEE Transactions on

Publisher :

IEEE

ISSN :

2168-2267

Type :

jour

DOI :

10.1109/TCYB.2014.2311578

Filename :

6805593

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1758883