• DocumentCode
    3011501
  • Title

    Mixed Reinforcement Learning for Partially Observable Markov Decision Process

  • Author

    Dung, Le Tien ; Komeda, Takashi ; Takagi, Motoki

  • Author_Institution
    Shibaura Inst. of Technol., Tokyo
  • fYear
    2007
  • fDate
    20-23 June 2007
  • Firstpage
    7
  • Lastpage
    12
  • Abstract
    Reinforcement learning has been widely used to solve problems in which the environment provides little feedback. Q-learning solves fully observable Markov decision processes quite well. For partially observable Markov decision processes (POMDPs), a recurrent neural network (RNN) can be used to approximate Q values, but learning time for these problems is typically very long. In this paper, Mixed Reinforcement Learning is presented to find an optimal policy for POMDPs in a shorter learning time. The method uses both a Q-value table and an RNN: the table stores Q values for fully observable states, and the RNN approximates Q values for hidden states. An observable degree is calculated for each state while the agent explores the environment. If the observable degree is less than a threshold, the state is treated as a hidden state. Experimental results on the lighting grid world problem show that the proposed method enables an agent to acquire a policy as good as the one acquired using only an RNN, with better learning performance.
  • Keywords
    Markov processes; learning (artificial intelligence); recurrent neural nets; Q learning; mixed reinforcement learning; partially observable Markov decision process; recurrent neural network; Computational intelligence; History; Learning; Neural networks; Neurofeedback; Recurrent neural networks; Robotics and automation; State-space methods; Table lookup; USA Councils
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Intelligence in Robotics and Automation, 2007. CIRA 2007. International Symposium on
  • Conference_Location
    Jacksonville, FL
  • Print_ISBN
    1-4244-0790-7
  • Electronic_ISBN
    1-4244-0790-7
  • Type

    conf

  • DOI
    10.1109/CIRA.2007.382910
  • Filename
    4269910
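
The table-or-network routing described in the Abstract field can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the observable degree is computed, so the measure used here (the lowest share of the most frequent successor observation across the actions tried) is a hypothetical stand-in. The class name `MixedQ`, the `hidden_q` callable that stands in for the RNN, and the `alpha` and `gamma` defaults are likewise assumptions made for illustration.

```python
from collections import Counter, defaultdict


class MixedQ:
    """Route Q-value lookups to a table or to a recurrent approximator."""

    def __init__(self, n_actions, threshold, hidden_q):
        self.n_actions = n_actions
        self.threshold = threshold  # assumed cut-off below which a state is "hidden"
        self.hidden_q = hidden_q    # stand-in for the RNN: history -> list of Q values
        self.table = defaultdict(lambda: [0.0] * n_actions)
        self.transitions = defaultdict(Counter)  # (obs, action) -> successor counts

    def record(self, obs, action, next_obs):
        """Log one observed transition gathered during exploration."""
        self.transitions[(obs, action)][next_obs] += 1

    def observable_degree(self, obs):
        """Hypothetical measure, not taken from the paper.

        For each tried action, take the share of the most frequent
        successor observation; return the minimum over those actions.
        Untried observations default to 1.0 (treated as observable).
        """
        degrees = []
        for action in range(self.n_actions):
            counts = self.transitions.get((obs, action))
            if counts:
                degrees.append(max(counts.values()) / sum(counts.values()))
        return min(degrees) if degrees else 1.0

    def is_hidden(self, obs):
        return self.observable_degree(obs) < self.threshold

    def q_values(self, obs, history):
        """Use the approximator for hidden states, the table otherwise."""
        if self.is_hidden(obs):
            return self.hidden_q(history)
        return self.table[obs]

    def update_table(self, obs, action, reward, next_q, alpha=0.1, gamma=0.9):
        """Standard one-step Q-learning update on the table entry."""
        q = self.table[obs]
        q[action] += alpha * (reward + gamma * max(next_q) - q[action])
```

In this sketch an observation whose successors are consistent stays in the table, while one with ambiguous successors is delegated to `hidden_q`, which in a real system would be a trained recurrent network fed with the observation history.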