• DocumentCode
    3306809
  • Title

    Labeling Q-learning embedded with knowledge update in partially observable mdp environments

  • Author

    Lee, Haeyeon ; Kamaya, Hiroyuki ; Abe, Kenichi

  • Author_Institution
    Electr. & Commun. Eng., Tohoku Univ., Aoba
  • fYear
    2004
  • fDate
    Aug. 30 2004-Sept. 1 2004
  • Firstpage
    329
  • Lastpage
    333
  • Abstract
    In POMDP (partially observable Markov decision process) environments, a learning agent cannot observe the environment directly, so partially observed states arise. To overcome this partial observability problem, we previously proposed a new RL (reinforcement learning) algorithm called "labeling Q-learning" (LQ-learning). Unlike the original LQ-learning, the advanced LQ-learning prepares prior knowledge about the environment ahead of the learning process. This knowledge is a kind of self-organizing classification of sequences (i.e., patterns of state transitions). It provides classified sequences consisting of previously visited states, each of which is called a "group" here. A new LQ-learning agent treats the transitions between groups as landmark-like labeling situations. In this paper, we try to extend the advanced LQ-learning, based on knowledge update, to more extended environments. To demonstrate that LQ-learning works with embedded knowledge even when that knowledge was built from another environment, we apply it to grid-world problems that appear widely in the literature (Wiering and Schmidhuber, 1997).
  • Keywords
    learning (artificial intelligence); LQ-learning agent; POMDP environment; RL algorithm; grid-world problem; labeling Q-learning; learning process; partially observable Markov decision process; reinforcement learning; self-organizing classification; Educational institutions; History; Labeling; Learning systems; Registers; State estimation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Cybernetics, 2004. ICCC 2004. Second IEEE International Conference on
  • Conference_Location
    Vienna
  • Print_ISBN
    0-7803-8588-8
  • Type

    conf

  • DOI
    10.1109/ICCCYB.2004.1437741
  • Filename
    1437741