Title :
Labeling Q-learning for non-Markovian environments
Author :
Lee, Hae Yeon; Kamaya, Hiroyuki; Abe, Kenichi
Author_Institution :
Dept. of Electr. & Commun. Eng., Tohoku Univ., Sendai, Japan
Abstract :
The most widely used reinforcement learning (RL) algorithms, such as Q-learning and TD(λ), are limited to Markovian environments. Recent research on RL algorithms has concentrated on partially observable Markov decision processes (POMDPs). The only way to overcome partial observability is to use memory to estimate the state. In this paper, we present a new memory architecture for RL algorithms that solves a certain class of POMDPs. Our algorithm, which we call labeling Q-learning (LQ-learning), is applied to simple maze problems taken from the recent literature. The results demonstrate LQ-learning's ability to perform in a near-optimal manner.
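The abstract does not detail the labeling mechanism, so the following Python sketch only illustrates the general idea: a tabular Q-learner whose internal state is an (observation, label) pair, with a small label memory used to tell perceptually aliased observations apart. The AliasedCorridor environment, the update_label rule, and all hyperparameters below are hypothetical stand-ins, not the authors' implementation.

import random
from collections import defaultdict

# Hedged sketch of the idea behind labeling Q-learning (LQ-learning):
# the Q-table is indexed by (observation, label) pairs rather than raw
# observations, so the label acts as a small piece of memory. The
# labeling rule below is a hypothetical placeholder, not the paper's method.

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
NUM_LABELS = 4          # size of the label memory (assumption)
ACTIONS = [0, 1]        # 0 = move left, 1 = move right

Q = defaultdict(float)  # Q[((obs, label), action)] -> value, default 0.0

def choose_action(obs, label):
    # Epsilon-greedy over the label-augmented internal state.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[((obs, label), a)])

def update_label(label, prev_obs, obs):
    # Hypothetical labeling rule: restart the label on a new observation
    # and advance it (mod NUM_LABELS) when the same observation repeats,
    # so consecutive visits to aliased cells get distinct internal states.
    return (label + 1) % NUM_LABELS if obs == prev_obs else 0

def run_episode(env, max_steps=50):
    obs, label = env.reset(), 0
    for _ in range(max_steps):
        action = choose_action(obs, label)
        next_obs, reward, done = env.step(action)
        next_label = update_label(label, obs, next_obs)
        # One-step Q-learning update on the label-augmented internal state.
        best_next = max(Q[((next_obs, next_label), a)] for a in ACTIONS)
        target = reward + (0.0 if done else GAMMA * best_next)
        Q[((obs, label), action)] += ALPHA * (target - Q[((obs, label), action)])
        obs, label = next_obs, next_label
        if done:
            break

class AliasedCorridor:
    # Toy POMDP: cells 0-1-2-3 in a row; cells 1 and 2 emit the same
    # observation (perceptual aliasing). Reaching cell 3 gives reward 1.
    OBS = (0, 1, 1, 2)
    def reset(self):
        self.pos = 0
        return self.OBS[self.pos]
    def step(self, action):
        self.pos = max(0, min(3, self.pos + (1 if action == 1 else -1)))
        done = self.pos == 3
        return self.OBS[self.pos], (1.0 if done else 0.0), done

env = AliasedCorridor()
for _ in range(500):
    run_episode(env)

Any rule mapping observation histories to a finite label set could replace update_label here; the paper's actual labeling scheme should be taken from the full text.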
Keywords :
learning (artificial intelligence); memory architecture; observability; state estimation; Markov decision process; labeling Q-learning; partial observability; reinforcement learning; Educational institutions; Feedback; Labeling; Learning; Signal processing; Testing; Working environment noise
Conference_Title :
1999 IEEE International Conference on Systems, Man, and Cybernetics (IEEE SMC '99) Conference Proceedings
Conference_Location :
Tokyo
Print_ISBN :
0-7803-5731-0
DOI :
10.1109/ICSMC.1999.815599