• DocumentCode
    1203080
  • Title

    Reinforcement-Learning-Based Output-Feedback Control of Nonstrict Nonlinear Discrete-Time Systems With Application to Engine Emission Control

  • Author

    Shih, Peter ; Kaul, Brian C. ; Jagannathan, Sarangapani ; Drallmeier, James A.

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Missouri Univ. of Sci. & Technol., Rolla, MO
  • Volume
    39
  • Issue
    5
  • fYear
    2009
  • Firstpage
    1162
  • Lastpage
    1179
  • Abstract
    A novel reinforcement-learning-based output adaptive neural network (NN) controller, which is also referred to as the adaptive-critic NN controller, is developed to deliver the desired tracking performance for a class of nonlinear discrete-time systems expressed in nonstrict feedback form in the presence of bounded and unknown disturbances. The adaptive-critic NN controller consists of an observer, a critic, and two action NNs. The observer estimates the states and output, and the two action NNs provide virtual and actual control inputs to the nonlinear discrete-time system. The critic approximates a certain strategic utility function, and the action NNs minimize the strategic utility function and control inputs. All NN weights adapt online toward minimization of a performance index, utilizing the gradient-descent-based rule, in contrast with iteration-based adaptive-critic schemes. Lyapunov functions are used to show the stability of the closed-loop tracking error, weights, and observer estimates. Separation and certainty equivalence principles, persistency of excitation condition, and linearity in the unknown parameter assumption are not needed. Experimental results on a spark ignition (SI) engine operating lean at an equivalence ratio of 0.75 show a significant (25%) reduction in cyclic dispersion in heat release with control, while the average fuel input changes by less than 1% compared with the uncontrolled case. Consequently, oxides of nitrogen (NOx) drop by 30%, and unburned hydrocarbons drop by 16% with control. Overall, NOx emissions are reduced by over 80% compared with stoichiometric levels.
  • Keywords
    Lyapunov methods; adaptive control; closed loop systems; discrete time systems; emission; feedback; ignition; internal combustion engines; iterative methods; learning (artificial intelligence); neurocontrollers; nonlinear control systems; observers; performance index; sparks; Lyapunov function; adaptive-critic NN controller; bounded disturbance; closed-loop tracking error; engine emission control; gradient-descent-based rule; iteration; nonstrict feedback; nonstrict nonlinear discrete-time system; observer; output adaptive neural network controller; output feedback control; performance index; reinforcement learning; spark ignition engine; stability; strategic utility function; tracking performance; Adaptive critic; discrete-time system; engine emission control; nonstrict nonlinear output feedback; reinforcement learning control; Algorithms; Artificial Intelligence; Biomimetics; Computer Simulation; Electric Power Supplies; Feedback; Models, Theoretical; Nonlinear Dynamics; Pattern Recognition, Automated; Reinforcement (Psychology); Signal Processing, Computer-Assisted; Vehicle Emissions;
  • fLanguage
    English
  • Journal_Title
    Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type

    jour

  • DOI
    10.1109/TSMCB.2009.2013272
  • Filename
    4804687