DocumentCode :
393477
Title :
Performance of LQ-learning in POMDP environments
Author :
Lee, Haeyeon ; Kamaya, Hiroyuki ; Abe, Kenichi
Author_Institution :
Sch. of Eng., Tohoku Univ., Sendai, Japan
Volume :
2
fYear :
2002
fDate :
5-7 Aug. 2002
Firstpage :
819
Abstract :
In this paper, we propose a new type of LQ-learning for solving POMDPs. In a POMDP environment, the agent cannot observe the state of the environment directly. In LQ-learning, the agent attaches labels to observations in order to discriminate between partially observed states that are perceived as the same. Unlike our previous LQ-learning, the new method prepares knowledge about the environment in advance. This knowledge is acquired automatically by Kohonen's Self-Organizing Map (SOM), which provides the agent with knowledge about state transitions. The LQ-learning agent then attaches labels to observations with reference to the map obtained by the SOM.
Keywords :
Markov processes; learning (artificial intelligence); self-organising feature maps; Kohonen self-organizing map; LQ-learning performance; POMDP environments; partially observed Markov decision processes; reinforcement learning; Educational institutions; Feedback; History; Labeling; Learning; Leg; Organizing; Registers; Signal processing; State estimation
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
SICE 2002. Proceedings of the 41st SICE Annual Conference
Print_ISBN :
0-7803-7631-5
Type :
conf
DOI :
10.1109/SICE.2002.1195263
Filename :
1195263