DocumentCode
393477
Title
Performance of LQ-learning in POMDP environments
Author
Lee, Haeyeon ; Kama, Hiroyuki ; Abe, Kenich
Author_Institution
Sch. of Eng., Tohoku Univ., Sendai, Japan
Volume
2
fYear
2002
fDate
5-7 Aug. 2002
Firstpage
819
Abstract
In this paper, we propose a new type of LQ-learning for solving POMDPs. In a POMDP environment, the agent cannot observe the state of the environment directly. In LQ-learning, in order to discriminate between partially observed states, the agent attaches a label to each of the observations that are perceived as identical. Unlike our previous LQ-learning, we prepare knowledge about the environment in advance. This knowledge is acquired automatically by Kohonen's Self-Organizing Map (SOM), which provides the agent with knowledge about state transitions. The LQ-learning agent then attaches labels to observations with reference to the map obtained by the SOM.
Keywords
Markov processes; learning (artificial intelligence); self-organising feature maps; Kohonen self-organizing map; LQ-learning performance; POMDP environments; partially observed Markov decision processes; reinforcement learning; Educational institutions; Feedback; History; Labeling; Learning; Leg; Organizing; Registers; Signal processing; State estimation;
fLanguage
English
Publisher
IEEE
Conference_Titel
SICE 2002. Proceedings of the 41st SICE Annual Conference
Print_ISBN
0-7803-7631-5
Type
conf
DOI
10.1109/SICE.2002.1195263
Filename
1195263