• DocumentCode
    1437628
  • Title
    Mean Field for Markov Decision Processes: From Discrete to Continuous Optimization
  • Author
    Gast, Nicolas ; Gaujal, Bruno ; Le Boudec, Jean-Yves
  • Author_Institution
    LCA2, EPFL, Lausanne, Switzerland
  • Volume
    57
  • Issue
    9
  • fYear
    2012
  • Firstpage
    2266
  • Lastpage
    2280
  • Abstract
    We study the convergence of Markov decision processes, composed of a large number of objects, to optimization problems on ordinary differential equations. We show that the optimal reward of such a Markov decision process, which satisfies a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov decision process. We give bounds on the difference of the rewards and an algorithm for deriving an approximating solution to the Markov decision process from a solution of the HJB equations. We illustrate the method on three examples pertaining, respectively, to investment strategies, population dynamics control and scheduling in queues. They are used to illustrate and justify the construction of the controlled ODE and to show the advantage of solving a continuous HJB equation rather than a large discrete Bellman equation.
  • Keywords
    Jacobian matrices; Markov processes; approximation theory; convergence of numerical methods; decision theory; differential equations; dynamic programming; Bellman equation; Hamilton-Jacobi-Bellman equation; Markov decision processes; continuous HJB equation; continuous optimization; discrete optimization; dynamic optimization problems; investment strategies; mean field approximation; ordinary differential equations; population dynamics control; Approximation methods; Convergence; Equations; Limiting; Manganese; Optimization; Epidemic model; Hamilton–Jacobi–Bellman (HJB); mean field; optimal control
  • fLanguage
    English
  • Journal_Title
    Automatic Control, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9286
  • Type
    jour
  • DOI
    10.1109/TAC.2012.2186176
  • Filename
    6144708