DocumentCode :
349968
Title :
Labeling Q-learning for non-Markovian environments
Author :
Lee, Hae Yeon ; Kamaya, Hiroyuki ; Abe, Kenichi
Author_Institution :
Dept. of Electr. & Commun. Eng., Tohoku Univ., Sendai, Japan
Volume :
5
fYear :
1999
fDate :
1999
Firstpage :
487
Abstract :
The most widely used reinforcement learning (RL) algorithms, such as Q-learning and TD(λ), are limited to Markovian environments. Recent research on reinforcement learning has concentrated on the partially observable Markov decision process (POMDP). The only way to overcome partial observability is to use memory to estimate state. In this paper, we present a new memory architecture for RL algorithms that solves a certain type of POMDP. Our algorithm, which we call labeling Q-learning (LQ-learning), is applied to test problems of simple mazes taken from the recent literature. The results demonstrate LQ-learning's ability to perform in a near-optimal manner.
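The abstract does not spell out the labeling mechanism, so the following is only a minimal Python sketch of the general idea: tabular Q-learning over label-augmented states, where a small discrete label serves as memory that lets aliased observations map to distinct entries in the Q-table. The labeling rule next_label, the constants, and every identifier below are hypothetical illustrations, not the paper's actual LQ-learning update.

import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
N_LABELS = 4  # size of the discrete memory attached to each observation

# Q-values are indexed by the label-augmented state (obs, label) and an action.
Q = defaultdict(float)

def choose_action(state, actions):
    # Epsilon-greedy action selection over the augmented state.
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, actions):
    # Standard one-step Q-learning backup, applied to augmented states.
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def next_label(label, obs, prev_obs):
    # Hypothetical labeling rule: advance the label whenever the same
    # (possibly aliased) observation is seen twice in a row, so repeated
    # visits to perceptually identical maze cells get distinct states.
    return (label + 1) % N_LABELS if obs == prev_obs else label

In a maze POMDP the agent would carry state = (obs, label), select actions with choose_action, and call update after each transition; under these assumptions, only the labeling rule distinguishes this sketch from ordinary tabular Q-learning.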
Keywords :
learning (artificial intelligence); memory architecture; observability; state estimation; Markov decision process; labeling Q-learning; partial observability; reinforcement learning; Educational institutions; Feedback; Labeling; Learning; Memory architecture; Observability; Signal processing; State estimation; Testing; Working environment noise;
fLanguage :
English
Publisher :
ieee
Conference_Title :
1999 IEEE International Conference on Systems, Man, and Cybernetics (IEEE SMC '99 Conference Proceedings)
Conference_Location :
Tokyo
ISSN :
1062-922X
Print_ISBN :
0-7803-5731-0
Type :
conf
DOI :
10.1109/ICSMC.1999.815599
Filename :
815599