• DocumentCode
    393477
  • Title

    Performance of LQ-learning in POMDP environments

  • Author

    Lee, Haeyeon ; Kama, Hiroyuki ; Abe, Kenich

  • Author_Institution
    Sch. of Eng., Tohoku Univ., Sendai, Japan
  • Volume
    2
  • fYear
    2002
  • fDate
    5-7 Aug. 2002
  • Firstpage
    819
  • Abstract
    In this paper, we propose a new type of LQ-learning for solving partially observable Markov decision processes (POMDPs). In a POMDP environment, the agent cannot observe the state of the environment directly. In LQ-learning, to discriminate between partially observed states, the agent attaches a label to each observation, so that distinct states perceived as the same observation can be told apart. Unlike our previous LQ-learning, knowledge about the environment is prepared in advance. This knowledge is acquired automatically by Kohonen's Self-Organizing Map (SOM), which provides the agent with knowledge about state transitions. The LQ-learning agent then attaches labels to observations with reference to the map obtained by the SOM.
  • Keywords
    Markov processes; learning (artificial intelligence); self-organising feature maps; Kohonen self-organizing map; LQ-learning performance; POMDP environments; partially observed Markov decision processes; reinforcement learning; Educational institutions; Feedback; History; Labeling; Learning; Leg; Organizing; Registers; Signal processing; State estimation
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    SICE 2002. Proceedings of the 41st SICE Annual Conference
  • Print_ISBN
    0-7803-7631-5
  • Type

    conf

  • DOI
    10.1109/SICE.2002.1195263
  • Filename
    1195263
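
The labeling idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract gives neither the labeling rule nor the SOM training procedure, so the sketch assumes a hypothetical lookup table `label_map` that stands in for the SOM-derived map, and it uses an ordinary tabular Q-learning update over (observation, label) pairs. The names `labeled_state`, `q_update`, and `label_map`, and the values of `alpha` and `gamma`, are all assumptions made for illustration.

```python
from collections import defaultdict


def labeled_state(obs, prev_state, prev_action, label_map):
    """Pair an observation with a label so aliased states can be told apart.

    label_map is a hypothetical stand-in for the SOM-derived map: it maps
    (previous labeled state, previous action, observation) to a label.
    Unknown transitions fall back to label 0.
    """
    label = label_map.get((prev_state, prev_action, obs), 0)
    return (obs, label)


def q_update(q, state, action, reward, next_state, actions,
             alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on labeled states."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Toy usage: two different places both produce the observation "corridor".
actions = ["left", "right"]
q = defaultdict(float)  # Q-values default to 0.0

# Assumed transition knowledge: moving right from the first corridor
# leads to a second corridor, which receives label 1.
label_map = {(("corridor", 0), "right", "corridor"): 1}

s0 = ("corridor", 0)
s1 = labeled_state("corridor", s0, "right", label_map)
value = q_update(q, s0, "right", 1.0, s1, actions)
```

Here the same observation yields two distinct table entries, `("corridor", 0)` and `("corridor", 1)`, so the Q-table can hold different action values for states the agent cannot distinguish by observation alone.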