  • DocumentCode
    2060865
  • Title
    PAC learning for Markov decision processes and dynamic games
  • Author
    Jain, Rahul; Varaiya, Pravin P.
  • Author_Institution
    EECS Dept., Univ. of California, Berkeley, CA, USA
  • fYear
    2004
  • fDate
    27 June-2 July 2004
  • Firstpage
    468
  • Abstract
    We extend the probably approximately correct (PAC) model of learning to Markov decision processes (MDPs) and dynamic games. We obtain simulation-based uniform sample complexity bounds for value function estimates of discounted-reward MDPs. We also obtain uniform sample complexity results for Markov games with a finite number of players.
  • Keywords
    Markov processes; decision theory; game theory; Markov decision process; Markov game; dynamic game; function estimation; probably approximately correct learning; sample complexity bound; Contracts; Convergence; Noise generators; Space stations; State-space methods; Stochastic processes
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Theory, 2004. ISIT 2004. Proceedings. International Symposium on
  • Print_ISBN
    0-7803-8280-3
  • Type
    conf
  • DOI
    10.1109/ISIT.2004.1365505
  • Filename
    1365505