• DocumentCode
    88632
  • Title
    Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming
  • Author
    Qinglai Wei ; Fei-Yue Wang ; Derong Liu ; Xiong Yang
  • Author_Institution
    State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Beijing, China
  • Volume
    44
  • Issue
    12
  • fYear
    2014
  • fDate
    Dec. 2014
  • Firstpage
    2820
  • Lastpage
    2833
  • Abstract
    In this paper, a new iterative adaptive dynamic programming (ADP) algorithm is developed to solve optimal control problems for infinite horizon discrete-time nonlinear systems with finite approximation errors. First, a new generalized value iteration algorithm of ADP is developed to make the iterative performance index function converge to the solution of the Hamilton-Jacobi-Bellman equation. The generalized value iteration algorithm permits an arbitrary positive semi-definite function to initialize it, which overcomes the disadvantage of traditional value iteration algorithms. When the iterative control law and iterative performance index function in each iteration cannot accurately be obtained, for the first time a new “design method of the convergence criteria” for the finite-approximation-error-based generalized value iteration algorithm is established. A suitable approximation error can be designed adaptively to make the iterative performance index function converge to a finite neighborhood of the optimal performance index function. Neural networks are used to implement the iterative ADP algorithm. Finally, two simulation examples are given to illustrate the performance of the developed method.
  • Keywords
    adaptive control; approximation theory; discrete time systems; dynamic programming; infinite horizon; iterative methods; neurocontrollers; nonlinear control systems; optimal control; Hamilton-Jacobi-Bellman equation; finite approximation error; finite neighborhood; finite-approximation-error-based discrete-time iterative adaptive dynamic programming; finite-approximation-error-based generalized value iteration algorithm; infinite horizon discrete-time nonlinear systems; iterative ADP algorithm; iterative adaptive dynamic programming algorithm; iterative control law; iterative performance index function; neural networks; optimal control problem; optimal performance index function; positive semidefinite function; traditional value iteration algorithm; Adaptive critic designs; adaptive dynamic programming (ADP); approximate dynamic programming; approximation error; neuro-dynamic programming; nonlinear systems; reinforcement learning; value iteration
  • fLanguage
    English
  • Journal_Title
    Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-2267
  • Type
    jour
  • DOI
    10.1109/TCYB.2014.2354377
  • Filename
    6912005
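
The abstract describes a value iteration that may start from an arbitrary positive semi-definite function and that, under bounded approximation errors, still converges to a neighborhood of the optimal performance index function. The sketch below illustrates only that general idea on a toy problem; it is not the paper's algorithm or its convergence criteria. Everything in it is an assumption made for illustration: the scalar linear system x(k+1) = a*x(k) + b*u(k), the quadratic utility q*x^2 + r*u^2, the quadratic form V_i(x) = p_i*x^2, the function names, and the fixed relative error `rel_error` that stands in for a neural-network approximation error.

```python
import math


def value_iteration_step(p, a, b, q, r):
    """One exact value-iteration update for V_i(x) = p * x**2 (toy model)."""
    # Minimizing q*x^2 + r*u^2 + p*(a*x + b*u)^2 over u gives u = -k*x.
    k = a * b * p / (r + b * b * p)
    # Substitute the minimizing control back into the right-hand side.
    return q + r * k * k + p * (a - b * k) ** 2


def generalized_value_iteration(p0, a, b, q, r, iters=200, rel_error=0.0):
    """Iterate from an arbitrary p0 >= 0, optionally with a bounded error.

    rel_error is a hypothetical stand-in for approximation error: each
    iterate is scaled by (1 + rel_error) or (1 - rel_error), alternating.
    """
    if p0 < 0:
        raise ValueError("initial function must be positive semi-definite")
    p = p0
    for i in range(iters):
        exact = value_iteration_step(p, a, b, q, r)
        sign = 1.0 if i % 2 == 0 else -1.0
        p = exact * (1.0 + sign * rel_error)
    return p


def optimal_p(a, b, q, r):
    """Positive root of the scalar Bellman (Riccati) fixed-point equation."""
    # b^2 p^2 + (r - q b^2 - a^2 r) p - q r = 0
    c = r - q * b * b - a * a * r
    return (-c + math.sqrt(c * c + 4.0 * b * b * q * r)) / (2.0 * b * b)


if __name__ == "__main__":
    a, b, q, r = 1.2, 1.0, 1.0, 1.0
    print("optimal p:          ", optimal_p(a, b, q, r))
    print("start at p0 = 0:    ", generalized_value_iteration(0.0, a, b, q, r))
    print("start at p0 = 10:   ", generalized_value_iteration(10.0, a, b, q, r))
    print("with 1% error:      ",
          generalized_value_iteration(10.0, a, b, q, r, rel_error=0.01))
```

In this toy setting the exact iteration reaches the same fixed point from both initial values, while the perturbed iteration stays near it without settling exactly on it, which mirrors the "finite neighborhood" behavior the abstract mentions.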