• DocumentCode
    2453394
  • Title

    Multimodal Parameter-exploring Policy Gradients

  • Author

    Sehnke, Frank ; Graves, Alex ; Osendorfer, Christian ; Schmidhuber, Jürgen

  • Author_Institution
    Tech. Univ. München, München, Germany
  • fYear
    2010
  • fDate
    12-14 Dec. 2010
  • Firstpage
    113
  • Lastpage
    118
  • Abstract
    Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estimates encountered in normal policy gradient methods. It has been shown to drastically speed up convergence for several large-scale reinforcement learning tasks. However, the independent normal distributions used by PGPE to search through parameter space are inadequate for some problems with multimodal reward surfaces. This paper extends the basic PGPE algorithm to use multimodal mixture distributions for each parameter, while remaining efficient. Experimental results on the Rastrigin function and the inverted pendulum benchmark demonstrate the advantages of this modification, with faster convergence to better optima.
  • Keywords
    gradient methods; learning (artificial intelligence); normal distribution; high-variance gradient estimates; independent normal distribution; large-scale reinforcement learning; model-free reinforcement learning; multimodal mixture distribution; multimodal parameter-exploring policy gradients; multimodal reward surfaces; normal policy gradient; parameter space; parameter-based exploration; Aerospace electronics; Benchmark testing; Convergence; Gradient methods; History; Learning; Probabilistic logic; Multi-Modal; Optimization; Parameter Exploration; Policy Gradients;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-9211-4
  • Type

    conf

  • DOI
    10.1109/ICMLA.2010.24
  • Filename
    5708821