DocumentCode
669407
Title
Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems
Author
Tae Yoon Chun ; Jin Bae Park ; Yoon Ho Choi
Author_Institution
Dept. of Electr. Eng., Yonsei Univ., Seoul, South Korea
fYear
2013
fDate
20-23 Oct. 2013
Firstpage
454
Lastpage
458
Abstract
This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is one of the reinforcement-learning-based dynamic programming (DP) methods for solving optimal control problems; it interleaves policy evaluation and policy improvement steps. To analyze the convergence and stability of GPI, several equivalent equations are derived. As a result, the PI-mode monotone convergence (i.e., behavior like PI) and stability of the GPI algorithm are proved under certain initial conditions that are closely related to the Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.
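For orientation, the following is a minimal sketch of GPI for the DT linear-quadratic setting the abstract describes; it is not code from the paper. The function name gpi_lqr, the system matrices, and the sweep counts n_eval and n_outer are illustrative assumptions. Setting n_eval = 1 recovers value iteration, while letting n_eval grow large recovers policy iteration, which is the "PI-mode" regime the paper studies.

```python
import numpy as np

def gpi_lqr(A, B, Q, R, K0, n_eval=3, n_outer=50):
    # Sketch of generalized policy iteration for DT LQR (illustrative, not the paper's code).
    # n_eval = 1 behaves like value iteration; large n_eval behaves like policy iteration.
    n = A.shape[0]
    K, P = K0, np.zeros((n, n))
    for _ in range(n_outer):
        Ac = A - B @ K                 # closed-loop matrix under the current policy u = -K x
        Qc = Q + K.T @ R @ K           # stage cost under the current policy
        # Partial policy evaluation: a finite number of Lyapunov-recursion sweeps
        for _ in range(n_eval):
            P = Qc + Ac.T @ P @ Ac
        # Policy improvement: greedy gain from the current value estimate
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P

# Illustrative second-order example; matrices are assumptions, not from the paper.
A = np.array([[0.9, 0.1], [0.0, 0.8]])   # Schur-stable, so K0 = 0 is stabilizing
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)
K0 = np.zeros((1, 2))
K, P = gpi_lqr(A, B, Q, R, K0)
print("approximate optimal gain K:\n", K)
```

Starting from a stabilizing initial gain, as in the sketch above, corresponds to the Lyapunov-related initial conditions under which the paper proves monotone convergence.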
Keywords
Lyapunov methods; discrete time systems; dynamic programming; learning (artificial intelligence); optimal control; stability; GPI; Lyapunov approach; PI-mode monotone convergence; discrete-time linear system; dynamic programming; generalized policy iteration; optimal control problem; policy iteration-mode monotone convergence; reinforcement learning; stability property; Approximation algorithms; Education; Stability analysis; generalized policy iteration; linear quadratic regulator; policy iteration-mode monotone convergence;
fLanguage
English
Publisher
ieee
Conference_Titel
2013 13th International Conference on Control, Automation and Systems (ICCAS)
Conference_Location
Gwangju
ISSN
2093-7121
Print_ISBN
978-89-93215-05-2
Type
conf
DOI
10.1109/ICCAS.2013.6703973
Filename
6703973