Title :
Learning with binary-valued utility using derivative adaptive critic methods
Author :
Matzner, Shari A. ; Shannon, Thaddeus T. ; Lendaris, George G.
Author_Institution :
NW Comput. Intelligence Lab., Portland State Univ., OR, USA
Abstract :
Adaptive critic methods for reinforcement learning are known to provide consistent solutions to optimal control problems, and are also considered plausible models for cognitive learning processes. This work discusses binary reinforcement in the context of three adaptive critic methods: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). Binary reinforcement arises when the qualitative measure of success is simply "pass" or "fail". We implement binary reinforcement with adaptive critic methods for the pole-cart benchmark problem. Results demonstrate two qualitatively dissimilar classes of controllers: those that replicate the system stabilization achieved with quadratic utility, and those that merely succeed at not dropping the pole. It is found that the GDHP method is effective for learning an approximately optimal solution, with results comparable to those obtained via DHP using a more informative quadratic utility function.
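For illustration, the sketch below (not from the paper) contrasts the binary "pass/fail" utility described in the abstract with a conventional quadratic utility on the pole-cart state. The state ordering, failure thresholds, and the diagonal weight matrix Q are assumed values typical of the pole-cart benchmark, not taken from the paper.

```python
import numpy as np

# Assumed state ordering: [cart position x, cart velocity, pole angle theta, pole angular velocity].
# Thresholds are conventional pole-cart benchmark values, assumed here for illustration.
X_LIMIT = 2.4                    # cart track half-length (m)
THETA_LIMIT = 12 * np.pi / 180   # pole angle failure limit (rad)

def binary_utility(state):
    """Binary reinforcement: 0 while the pole stays up and the cart stays on
    the track ("pass"), -1 on failure. Provides no gradient information away
    from the failure boundary."""
    x, _, theta, _ = state
    failed = abs(x) > X_LIMIT or abs(theta) > THETA_LIMIT
    return -1.0 if failed else 0.0

# Assumed diagonal weights; the paper's exact quadratic utility is not reproduced here.
Q = np.diag([0.25, 0.05, 1.0, 0.05])

def quadratic_utility(state):
    """Quadratic utility U(s) = -s^T Q s: more informative, since both its
    value and its state derivatives vary smoothly across the state space."""
    s = np.asarray(state, dtype=float)
    return -float(s @ Q @ s)

if __name__ == "__main__":
    s = np.array([0.5, 0.0, 0.05, 0.0])
    print(binary_utility(s), quadratic_utility(s))
```

The contrast motivates the paper's comparison: derivative adaptive critics such as DHP and GDHP train on the derivatives of the utility with respect to the state, which the quadratic form supplies everywhere, whereas a binary utility supplies useful derivative information only near the failure boundary.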
Keywords :
adaptive systems; cognitive systems; dynamic programming; heuristic programming; learning (artificial intelligence); optimal control; stability; binary reinforcement learning; binary-valued utility function; cognitive learning processes; derivative adaptive critic methods; dual heuristic programming; dynamic programming; optimal control problems; pole-cart benchmark problem; system stabilization; Adaptive control; Computational intelligence; Control systems; Cost function; Dynamic programming; Learning; Optimal control; Programmable control; State feedback; State-space methods;
Conference_Title :
2004 IEEE International Joint Conference on Neural Networks, Proceedings
Print_ISBN :
0-7803-8359-1
DOI :
10.1109/IJCNN.2004.1380882