  • DocumentCode
    2717050
  • Title
    Fitted Q Iteration with CMACs
  • Author
    Timmer, Stephan; Riedmiller, Martin

  • Author_Institution
    Dept. of Comput. Sci., Osnabrueck Univ.
  • fYear
    2007
  • fDate
    1-5 April 2007
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    A major issue in model-free reinforcement learning is how to efficiently exploit the data collected by an exploration strategy. This is especially important in the case of continuous, high-dimensional state spaces, since it is impossible to explore such spaces exhaustively. A simple but promising approach is to fix the number of state transitions sampled from the underlying Markov decision process. For several kernel-based learning algorithms there exist convergence proofs and notable empirical results when a fixed set of transition instances is used. In this article, we analyze how function approximators similar to the CMAC architecture can be combined with this idea. We show both analytically and empirically the potential power of the CMAC architecture combined with an offline version of Q-learning.
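    As a rough illustration of the procedure the abstract describes (fitted Q iteration over a fixed batch of transitions, using a CMAC-style tile-coding approximator and an offline Q-learning update), a minimal Python sketch follows. It is not the authors' implementation: the CMAC class, the one-dimensional state space, and every parameter value are illustrative assumptions.

```python
import numpy as np


class CMAC:
    """Tile-coding (CMAC-style) linear Q-function over a 1-D state space.

    Hypothetical architecture for illustration; the paper's CMAC variant
    and its parameters are not specified here.
    """

    def __init__(self, n_tilings=8, n_tiles=10, n_actions=2,
                 low=0.0, high=1.0, seed=0):
        self.n_tilings, self.n_tiles, self.n_actions = n_tilings, n_tiles, n_actions
        self.low, self.high = low, high
        rng = np.random.default_rng(seed)
        # Random offsets shift each tiling so their tiles overlap.
        self.offsets = rng.uniform(0.0, 1.0 / n_tiles, size=n_tilings)
        # One weight per (action, tiling, tile); the extra tile absorbs the offset.
        self.w = np.zeros((n_actions, n_tilings, n_tiles + 1))

    def _active_tiles(self, s):
        x = np.clip((s - self.low) / (self.high - self.low), 0.0, 1.0)
        return np.minimum(((x + self.offsets) * self.n_tiles).astype(int),
                          self.n_tiles)

    def q(self, s, a):
        # Q(s, a) is the sum of the active tile weights, one per tiling.
        return self.w[a, np.arange(self.n_tilings), self._active_tiles(s)].sum()

    def fit_step(self, s, a, target, lr=0.1):
        # One gradient step of the supervised regression toward the target.
        tiles = self._active_tiles(s)
        err = target - self.q(s, a)
        self.w[a, np.arange(self.n_tilings), tiles] += lr * err / self.n_tilings


def fitted_q_iteration(batch, qf, gamma=0.95, n_iter=50, n_epochs=5):
    """Offline Q-learning on a fixed batch of (s, a, r, s_next, done) tuples."""
    for _ in range(n_iter):
        # Freeze the current Q to compute bootstrapped regression targets ...
        targets = [(s, a,
                    r if done else r + gamma * max(qf.q(s2, b)
                                                   for b in range(qf.n_actions)))
                   for (s, a, r, s2, done) in batch]
        # ... then refit the approximator on those targets (supervised step).
        for _ in range(n_epochs):
            for s, a, t in targets:
                qf.fit_step(s, a, t)
    return qf
```

    On real data, the batch would be collected once by an exploration policy and then held fixed, matching the fixed-transition-set setting the abstract emphasizes.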
  • Keywords
    Markov processes; cerebellar model arithmetic computers; computer architecture; iterative methods; learning (artificial intelligence); CMAC architecture; Markov decision process; Q-learning; fitted Q iteration; function approximators; kernel-based learning; reinforcement learning; Algorithm design and analysis; Computer science; Convergence; Dynamic programming; Inference algorithms; Interleaved codes; Sampling methods; Space exploration; State-space methods; Supervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    2007 IEEE International Symposium on Approximate Dynamic Programming and Reinforcement Learning (ADPRL 2007)
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    1-4244-0706-0
  • Type
    conf
  • DOI
    10.1109/ADPRL.2007.368162
  • Filename
    4220807