Optimistic planning for continuous-action deterministic systems

Author

Busoniu, L.; Daniels, Andrew; Munos, Remi; Babuska, Robert

Author_Institution

CRAN, Univ. de Lorraine, Vandoeuvre les Nancy, France

fYear

2013

fDate

16-19 April 2013

Firstpage

69

Lastpage

76

Abstract

We consider the class of online planning algorithms for optimal control, which, compared to dynamic programming, are relatively unaffected by large state dimensionality. We introduce a novel planning algorithm called SOOP that works for deterministic systems with continuous states and actions. SOOP is the first method to explore the true solution space, consisting of infinite sequences of continuous actions, without requiring knowledge about the smoothness of the system. SOOP can be used parameter-free at the cost of more model calls, but we also propose a more practical variant tuned by a parameter α, which balances finer discretization against longer planning horizons. Experiments on three problems show that SOOP reliably ranks among the best algorithms and fully dominates competing methods when the problem requires both long horizons and fine discretization.

Keywords

Markov processes; dynamic programming; optimal control; Markov decision process; SOOP; continuous-action deterministic systems; online planning algorithm; optimistic planning; Aerospace electronics; Heuristic algorithms; Measurement; Optimization; Planning; Upper bound

fLanguage

English

Publisher

IEEE

Conference_Titel

2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Conference_Location

Singapore

ISSN

2325-1824

Type

conf

DOI

10.1109/ADPRL.2013.6614991

Filename

6614991