Acceleration of Reinforcement Learning by Policy Evaluation Using Nonstationary Iterative Method

Author

Senda, K. ; Hattori, Saki ; Hishinuma, Toru ; Kohda, Tohru

Author_Institution

Dept. of Aeronaut. & Astronaut., Kyoto Univ., Kyoto, Japan

Volume

44

Issue

12

fYear

2014

fDate

Dec. 2014

Firstpage

2696

Lastpage

2705

Abstract

Typical methods for solving reinforcement learning problems iterate two steps: policy evaluation and policy improvement. This paper proposes algorithms for policy evaluation that improve learning efficiency. The proposed algorithms are based on the Krylov subspace method (KSM), which is a nonstationary iterative method. The algorithms based on KSM are tens to hundreds of times more efficient than existing algorithms based on stationary iterative methods. Algorithms based on KSM are far more efficient than has generally been expected. This paper clarifies what makes algorithms based on KSM more efficient, using numerical examples and theoretical discussion.
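
As a rough illustration of the contrast drawn in the abstract, the following Python sketch evaluates a fixed policy by solving (I - gamma P) v = r in two ways: a stationary fixed-point iteration and GMRES, used here as one representative Krylov subspace method. The 20-state chain, the reward vector, gamma = 0.99, the tolerance, and all function names are assumptions made for illustration and are not taken from the paper, whose algorithms may differ.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def matvec(M, x):
    return [dot(row, x) for row in M]

def stationary_eval(P, r, gamma, tol=1e-8, max_iter=100000):
    """Stationary iteration v <- r + gamma * P v; returns (v, iterations)."""
    v = [0.0] * len(r)
    for k in range(max_iter):
        v_new = [ri + gamma * pv for ri, pv in zip(r, matvec(P, v))]
        # v_new - v equals the residual r - (I - gamma P) v
        if norm([a - b for a, b in zip(v_new, v)]) <= tol * norm(r):
            return v, k
        v = v_new
    return v, max_iter

def gmres_eval(P, r, gamma, tol=1e-8):
    """Unrestarted GMRES on (I - gamma P) v = r, starting from v = 0."""
    n = len(r)
    def A(x):
        return [xi - gamma * px for xi, px in zip(x, matvec(P, x))]
    beta = norm(r)
    Q = [[ri / beta for ri in r]]      # orthonormal Krylov basis
    H, cs, sn, g = [], [], [], [beta]  # rotated Hessenberg columns, rotations, rhs
    for k in range(n):
        w = A(Q[k])
        h = []
        for q in Q:                    # modified Gram-Schmidt (Arnoldi step)
            hq = dot(w, q)
            h.append(hq)
            w = [wi - hq * qi for wi, qi in zip(w, q)]
        h_next = norm(w)
        h.append(h_next)
        for i in range(k):             # apply earlier Givens rotations
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        d = math.hypot(h[k], h[k + 1])  # new rotation zeroes h[k + 1]
        c, s = h[k] / d, h[k + 1] / d
        cs.append(c)
        sn.append(s)
        h[k], h[k + 1] = d, 0.0
        g.append(-s * g[k])
        g[k] = c * g[k]
        H.append(h)
        if abs(g[k + 1]) <= tol * beta:  # |g[k+1]| is the residual norm
            break
        Q.append([wi / h_next for wi in w])
    m = len(H)
    y = [0.0] * m
    for i in reversed(range(m)):       # back substitution, R[i][j] = H[j][i]
        y[i] = (g[i] - sum(H[j][i] * y[j] for j in range(i + 1, m))) / H[i][i]
    v = [sum(y[j] * Q[j][i] for j in range(m)) for i in range(n)]
    return v, m

# Hypothetical chain under a fixed policy: step right with probability 0.7,
# left with probability 0.3; moves that would leave the chain are clipped
# to the end state. Reward 1 is received in the last state only.
n, gamma = 20, 0.99
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    P[i][min(i + 1, n - 1)] += 0.7
    P[i][max(i - 1, 0)] += 0.3
r = [0.0] * (n - 1) + [1.0]

v_stat, it_stat = stationary_eval(P, r, gamma)
v_kry, it_kry = gmres_eval(P, r, gamma)
print("stationary iterations:", it_stat)
print("GMRES iterations:", it_kry)
```

Each iteration of either method costs one product with P, so the iteration counts give a crude cost comparison. Unrestarted GMRES needs at most n iterations in exact arithmetic, whereas the error of the stationary iteration is only guaranteed to contract by a factor of gamma per step, so its iteration count grows as gamma approaches 1.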

Keywords

iterative methods; learning (artificial intelligence); KSM; Krylov subspace method; learning efficiency; nonstationary iterative method; policy evaluation; policy improvement; reinforcement learning problems; stationary iterative methods; Convergence; Eigenvalues and eigenfunctions; Equations; Q-factor; Vectors; policy iteration; reinforcement learning

fLanguage

English

Journal_Title

Cybernetics, IEEE Transactions on

Publisher

IEEE

ISSN

2168-2267

Type

jour

DOI

10.1109/TCYB.2014.2313655

Filename

6786366