  • DocumentCode
    677972
  • Title
    Deep Belief Network for Modeling Hierarchical Reinforcement Learning Policies
  • Author
    Djurdjevic, Predrag D. ; Huber, Manfred
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Texas at Arlington, Arlington, TX, USA
  • fYear
    2013
  • fDate
    13-16 Oct. 2013
  • Firstpage
    2485
  • Lastpage
    2491
  • Abstract
    Over their lifetime, intelligent agents face multiple tasks that require simultaneous modeling and control of complex, initially unknown environments that are perceived only through incomplete and uncertain observations. In such scenarios, policy learning is subject to the curse of dimensionality, which causes scaling problems for traditional Reinforcement Learning (RL). To address this, the agent has to acquire and reuse latent knowledge efficiently. One way to do so is Hierarchical Reinforcement Learning (HRL), which extends RL with a hierarchical, model-based approach to state, reward, and policy representation. This paper presents a novel learning approach for HRL based on Conditional Restricted Boltzmann Machines (CRBMs). The proposed model provides a uniform means to simultaneously learn policies and their associated abstract state features, and allows hierarchical skills to be learned and executed within a single, consistent network structure. In this model, learning proceeds incrementally from basic grounded features to complex abstract policies based on automatically extracted latent states and rewards.
  • Keywords
    Boltzmann machines; belief networks; learning (artificial intelligence); CRBM; HRL; conditional restricted Boltzmann machines; deep belief network; hierarchical reinforcement learning policy modeling; intelligent agents; Abstracts; Buildings; Computational modeling; Context; Learning (artificial intelligence); Training; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics (SMC), 2013 IEEE International Conference on
  • Conference_Location
    Manchester
  • Type
    conf
  • DOI
    10.1109/SMC.2013.424
  • Filename
    6722177