DocumentCode
3306809
Title
Labeling Q-learning embedded with knowledge update in partially observable MDP environments
Author
Lee, Haeyeon ; Kamaya, Hiroyuki ; Abe, Kenichi
Author_Institution
Electr. & Commun. Eng., Tohoku Univ., Aoba
fYear
2004
fDate
Aug. 30 2004-Sept. 1 2004
Firstpage
329
Lastpage
333
Abstract
In POMDP (partially observable Markov decision process) environments, a learning agent cannot observe the environment state directly, so distinct states can yield identical observations. To overcome this partial observability problem, we previously proposed a new RL (reinforcement learning) algorithm called "labeling Q-learning" (LQ-learning). Unlike the original LQ-learning, the advanced LQ-learning prepares prior knowledge about the environment ahead of the learning process. This knowledge is a kind of self-organized classification of sequences (i.e., patterns of state transitions); each classified sequence of visited states is called a "group". The new LQ-learning agent treats transitions between groups as landmark-like labeling events. In this paper, we extend the advanced LQ-learning with knowledge update to larger environments. To demonstrate that LQ-learning can be embedded with knowledge even when that knowledge was built from a different environment, we apply it to grid-world problems found in many works in the literature (Wiering and Schmidhuber, 1997).
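The core idea described in the abstract can be illustrated with a minimal sketch: a Q-table indexed by (observation, label) pairs, where the label acts as internal memory that disambiguates aliased observations. Note this is a simplified illustration, not the paper's exact algorithm: the class name, the toy labeling rule, and all parameter names are assumptions; the paper's group-based, self-organized labeling is replaced here by a trivial stand-in.

```python
import random
from collections import defaultdict

class LQAgent:
    """Sketch of a labeling Q-learning agent (hypothetical simplification).

    The agent's state is a pair (observation, label); the label plays the
    role of the landmark-like labeling described in the abstract.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)   # Q[((obs, label), action)] -> value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor
        self.epsilon = epsilon        # exploration probability
        self.label = 0                # current internal label (memory)

    def assign_label(self, obs, aliased_obs):
        # Toy labeling rule (stand-in for the paper's group mechanism):
        # flip the label whenever a known-aliased observation is seen.
        if obs in aliased_obs:
            self.label = (self.label + 1) % 2
        return self.label

    def select(self, state):
        # Epsilon-greedy action selection over the labeled state.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update on labeled states.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next - self.q[(state, action)]
        self.q[(state, action)] += self.alpha * td_error
```

Because two grid cells with the same local observation but different labels map to different Q-table entries, the agent can learn distinct policies for them, which is the point of the labeling mechanism.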
Keywords
learning (artificial intelligence); LQ-learning agent; POMDP environment; RL algorithm; grid-world problem; labeling Q-learning; learning process; partially observable Markov decision process; reinforcement learning; self-organizing classification; Educational institutions; History; Labeling; Learning systems; Registers; State estimation;
fLanguage
English
Publisher
ieee
Conference_Titel
Second IEEE International Conference on Computational Cybernetics (ICCC 2004)
Conference_Location
Vienna
Print_ISBN
0-7803-8588-8
Type
conf
DOI
10.1109/ICCCYB.2004.1437741
Filename
1437741