  • DocumentCode
    2568477
  • Title
    Multiresolution state-space discretization method for Q-learning with function approximation and policy iteration
  • Author
    Lampton, Amanda; Valasek, John
  • Author_Institution
    Dept. of Aerosp. Eng., Texas A&M Univ., College Station, TX, USA
  • fYear
    2009
  • fDate
    11-14 Oct. 2009
  • Firstpage
    2677
  • Lastpage
    2682
  • Abstract
    A multiresolution state-space discretization method is developed for the episodic unsupervised learning method of Q-learning. In addition, a genetic algorithm is used periodically during learning to approximate the action-value function. Policy iteration is added as a stopping criterion for the algorithm. For large-scale problems, Q-learning often suffers from the curse of dimensionality due to the large number of possible state-action pairs. This paper develops a method whereby a state space is adaptively discretized by progressively finer grids around the areas of interest within the state or learning space. Policy iteration is added to prevent unnecessary episodes at each level of discretization once the learning has converged. The utility of the method is demonstrated by application to the problem of a morphing airfoil with two morphing parameters (two state variables). By setting the multiresolution method to define the area of interest by the goal the agent seeks, it is shown that this method can learn a specific goal to within ±0.002 while reducing the total number of episodes needed to converge by 85% relative to the total number of episodes allotted. It is also shown that a good approximation of the action-value function is produced, with 80% agreement between the tabulated and approximated policies, though empirically the approximated policy appears to be superior.
  • Keywords
    discrete systems; function approximation; genetic algorithms; intelligent robots; iterative methods; learning systems; optimal control; state-space methods; unsupervised learning; Q-learning; action-value function approximation; agent goal; dimensionality curse; episodic unsupervised learning method; genetic algorithm; intelligent robot; morphing airfoil problem; multiresolution state-space discretization method; optimal control policy iteration; state-action pair; state-space method; stopping criterion; Automotive components; Convergence; Cybernetics; Function approximation; Genetic algorithms; Large-scale systems; Orbital robotics; Space vehicles; USA Councils; Unsupervised learning; Function Approximation; Genetic Algorithm; Multiresolution; Policy Iteration; Q-learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man and Cybernetics, 2009. SMC 2009. IEEE International Conference on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1062-922X
  • Print_ISBN
    978-1-4244-2793-2
  • Electronic_ISBN
    1062-922X
  • Type
    conf
  • DOI
    10.1109/ICSMC.2009.5346129
  • Filename
    5346129
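
The abstract's core idea (progressively finer grids around the goal, with learning at each level cut short once the policy stops changing) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a one-dimensional state instead of the two morphing parameters, uses a plain "greedy policy unchanged for `patience` episodes" check as a stand-in for the paper's policy-iteration stopping criterion, and omits the genetic-algorithm function approximation entirely. All function names, rewards, and parameter values are hypothetical.

```python
import random

def make_grid(lo, hi, n_cells):
    """Return n_cells + 1 evenly spaced grid points covering [lo, hi]."""
    step = (hi - lo) / n_cells
    return [lo + i * step for i in range(n_cells + 1)]

def greedy_policy(q):
    """Greedy action (0 = step left, 1 = step right) for each state."""
    return [0 if row[0] >= row[1] else 1 for row in q]

def learn_level(grid, goal, max_episodes, patience, rng,
                alpha=0.5, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning on one grid (assumed rewards: +1 at goal, -0.01 else).

    Stops early once the greedy policy has stayed unchanged for `patience`
    consecutive episodes; returns (goal index, episodes used).
    """
    n = len(grid)
    goal_idx = min(range(n), key=lambda i: abs(grid[i] - goal))
    q = [[0.0, 0.0] for _ in range(n)]
    stable, previous = 0, greedy_policy(q)
    for episode in range(1, max_episodes + 1):
        s = rng.randrange(n)
        for _ in range(4 * n):  # cap on steps per episode
            if s == goal_idx:
                break
            if rng.random() < epsilon:
                a = rng.randrange(2)          # explore
            else:
                a = greedy_policy([q[s]])[0]  # exploit
            s_next = max(0, min(n - 1, s + (1 if a == 1 else -1)))
            at_goal = s_next == goal_idx
            reward = 1.0 if at_goal else -0.01
            target = reward if at_goal else reward + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
        current = greedy_policy(q)
        stable = stable + 1 if current == previous else 0
        previous = current
        if stable >= patience:
            return goal_idx, episode
    return goal_idx, max_episodes

def multiresolution_q_learning(lo, hi, goal, n_cells=10, tol=0.002,
                               max_episodes=200, patience=20, seed=0):
    """Learn on successively finer grids centred on the goal region."""
    rng = random.Random(seed)
    history = []  # (grid spacing, episodes used) per level
    while True:
        grid = make_grid(lo, hi, n_cells)
        step = grid[1] - grid[0]
        goal_idx, used = learn_level(grid, goal, max_episodes, patience, rng)
        history.append((step, used))
        if step <= tol:
            return grid[goal_idx], history
        # Next level: a finer grid over the cells adjacent to the goal point.
        lo = max(lo, grid[goal_idx] - step)
        hi = min(hi, grid[goal_idx] + step)

estimate, history = multiresolution_q_learning(0.0, 1.0, goal=0.637)
for step, used in history:
    print(f"grid spacing {step:.4f}: {used} episodes")
```

Each level reuses the same small number of grid points, so the table size stays constant while the resolution near the goal increases; comparing the episodes used per level against `max_episodes` gives a rough sense of how much an early-stopping check of this kind can save.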