• DocumentCode
    288691
  • Title
    A reinforcement learning approach to on-line optimal control
  • Author
    An, P.E. ; Aslam-Mir, S. ; Brown, M. ; Harris, C.J.
  • Author_Institution
    Dept. of Aeronaut. & Astronaut., Southampton Univ., UK
  • Volume
    4
  • fYear
    1994
  • fDate
    27 Jun-2 Jul 1994
  • Firstpage
    2465
  • Abstract
    Presents a hybrid control architecture for solving on-line optimal control problems. In this architecture, the control law is dynamically scheduled between a reinforcement controller and a stabilizing controller so that the closed-loop performance is smoothly transformed from a reactive behavior to a predictive one. Based on a modified Q-learning technique, the reinforcement controller is made of two components: a policy function and a Q function. The policy function is explicitly incorporated so as to bypass the minimum operator normally required for selecting actions and updating the Q function. This architecture is then applied to a repetitive operation using a second-order linear time-variant plant with a nonlinear control structure. In this operation, the reinforcement signals are based on set-point errors and the reinforcement controller is generalized using second-order B-spline networks. This example illustrates how, for a non-optimally tuned stabilizing controller, the closed-loop performance can be bootstrapped with the use of reinforcement learning. Results show that the set-point performance of the hybrid controller is improved over that of the fixed-structure controller by discovering better control strategies which compensate for the non-optimal gains and nonlinear control structure.
  • Keywords
    learning (artificial intelligence); linear systems; nonlinear control systems; optimal control; splines (mathematics); time-varying systems; closed-loop performance; hybrid control architecture; modified Q-learning technique; nonlinear control structure; nonoptimally tuned stabilizing controller; online optimal control; reactive behavior; reinforcement controller; reinforcement learning; repetitive operation; second-order B-splines networks; second-order linear-time-variant plant; set-point errors; stabilizing controller; Control systems; Costs; Dynamic scheduling; Error correction; Kinematics; Nonlinear control systems; Optimal control; Sampling methods; Spline; Supervised learning
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 1994. IEEE World Congress on Computational Intelligence., 1994 IEEE International Conference on
  • Conference_Location
    Orlando, FL
  • Print_ISBN
    0-7803-1901-X
  • Type
    conf
  • DOI
    10.1109/ICNN.1994.374607
  • Filename
    374607