Abstract:
The curse of dimensionality in dynamic programming prevents, in most problems of practical interest, the exact computation of the value function. We study the fixed points of approximate value iteration, a simple algorithm that combats the curse of dimensionality by generating approximate iterates of the classical value iteration algorithm in the span of a set of prespecified basis functions. We show that, in general, the modified dynamic programming operator need not possess a fixed point, and therefore approximate value iteration should not be expected to converge. However, when a certain class of randomized policies is used, approximate value iteration is guaranteed to possess at least one fixed point. Finally, we discuss the link between approximate value iteration and temporal-difference learning (TD), and show that the existence of fixed points for approximate value iteration implies the existence of stationary points for the ordinary differential equation approximated by a version of TD that incorporates “exploration”.
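For concreteness, the following is a minimal sketch of the approximate value iteration scheme described above, applied to a small finite Markov decision process. The notation and implementation choices (a least-squares projection onto the span of the basis functions, the array shapes, and all function names) are illustrative assumptions, not taken from the paper; as the abstract notes, the iteration need not converge.

```python
import numpy as np

def bellman_operator(J, P, g, alpha):
    # (TJ)(x) = min_u [ g(x, u) + alpha * sum_y P(y | x, u) * J(y) ]
    # P has shape (num_actions, num_states, num_states); g has shape (num_states, num_actions).
    expected_next = np.einsum('axy,y->xa', P, J)   # E[J(next state) | state x, action u]
    return np.min(g + alpha * expected_next, axis=1)

def approximate_value_iteration(Phi, P, g, alpha, num_iters=100):
    # Generates iterates Phi r_{k+1} = Pi T Phi r_k, where Pi is here taken to be a
    # least-squares projection onto the span of the basis functions (columns of Phi).
    r = np.zeros(Phi.shape[1])
    for _ in range(num_iters):
        TJ = bellman_operator(Phi @ r, P, g, alpha)
        r, *_ = np.linalg.lstsq(Phi, TJ, rcond=None)  # projection step
    return r

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_states, num_actions, num_basis = 6, 2, 3
    P = rng.random((num_actions, num_states, num_states))
    P /= P.sum(axis=2, keepdims=True)           # row-stochastic transition kernels
    g = rng.random((num_states, num_actions))   # per-stage costs
    Phi = rng.random((num_states, num_basis))   # prespecified basis functions
    r = approximate_value_iteration(Phi, P, g, alpha=0.9)
    print("weights:", r, "approximate value function:", Phi @ r)
```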
Keywords:
Markov processes; dynamic programming; function approximation; iterative methods; learning (artificial intelligence); approximate value iteration; modified dynamic programming operator; randomized policies; temporal-difference learning; control systems; differential equations; state-space methods; stochastic processes