Credit of optimal state transition based reinforcement learning algorithm

Author

Bai, Tingfeng ; Wu, Gengfeng

Author_Institution

Sch. of Comput. Sci. & Eng., Shanghai Univ., China

Volume

1

fYear

2003

fDate

14-17 Dec. 2003

Firstpage

62

Abstract

This paper proposes an optimal model based on the distance between the current state and the goal state and on the cost of state transitions, in order to solve goal-state problems more effectively. Based on this model, a unique reinforcement learning algorithm named COSTRLA (credit of optimal state transition based reinforcement learning algorithm) is also presented. COSTRLA defines a COST function that evaluates the optimality of the output strategy and derives an update rule for that function from the dynamic programming principle, while the reinforcement signal is defined as the distance from the current state to the goal state. COSTRLA is applied to the cooperative control of the Buddy-Arnolds robot. Simulation experiments show the advantages of COSTRLA over popular reinforcement learning algorithms such as Q-learning and prioritized sweeping.
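
The abstract does not give the COST function or its update rule, so the following is only a minimal sketch of the general idea it describes: a cost-to-go estimate seeded with the distance to the goal and refined by a dynamic-programming (Bellman-style) sweep. Everything here is an assumption made for illustration, not the authors' COSTRLA: the grid world, the unit transition cost, and the names `learn_cost_to_go`, `greedy_path`, and `blocked`.

```python
from typing import Dict, FrozenSet, List, Tuple

State = Tuple[int, int]
ACTIONS: List[Tuple[int, int]] = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def distance(s: State, goal: State) -> int:
    """Manhattan distance, used here as the distance-to-goal signal."""
    return abs(s[0] - goal[0]) + abs(s[1] - goal[1])


def step(s: State, a: Tuple[int, int], size: int, blocked: FrozenSet[State]) -> State:
    """Deterministic transition; moves off the grid or into a wall leave s unchanged."""
    nxt = (s[0] + a[0], s[1] + a[1])
    inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
    return nxt if inside and nxt not in blocked else s


def learn_cost_to_go(size: int, goal: State, blocked: FrozenSet[State],
                     sweeps: int = 50, step_cost: float = 1.0) -> Dict[State, float]:
    """Refine a cost-to-go table with repeated Bellman backups.

    Assumption: the table is initialised with the distance to the goal,
    then each state takes the cheapest "transition cost + successor cost".
    """
    cost = {(x, y): float(distance((x, y), goal))
            for x in range(size) for y in range(size)
            if (x, y) not in blocked}
    for _ in range(sweeps):
        for s in cost:
            if s == goal:
                cost[s] = 0.0
                continue
            cost[s] = min(step_cost + cost[step(s, a, size, blocked)]
                          for a in ACTIONS)
    return cost


def greedy_path(cost: Dict[State, float], start: State, goal: State,
                size: int, blocked: FrozenSet[State]) -> List[State]:
    """Follow the successor with the lowest estimated cost until the goal."""
    path, s = [start], start
    while s != goal and len(path) <= size * size:
        s = min((step(s, a, size, blocked) for a in ACTIONS),
                key=lambda n: cost[n])
        path.append(s)
    return path
```

In this toy setting, a wall makes the raw distance an underestimate of the true cost, and the sweeps raise the estimates until they match the shortest path around the wall. Q-learning and prioritized sweeping, which the abstract compares against, would instead learn from sampled transitions; no such comparison is attempted here.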

Keywords

cooperative systems; dynamic programming; learning (artificial intelligence); multi-robot systems; optimal systems; optimal model; optimal state transition; prioritized sweeping algorithms; reinforcement learning algorithm; Computer science; Cost function; Dynamic programming; Learning; Predictive models; Robot control

fLanguage

English

Publisher

IEEE

Conference_Titel

Neural Networks and Signal Processing, 2003. Proceedings of the 2003 International Conference on

Conference_Location

Nanjing

Print_ISBN

0-7803-7702-8

Type

conf

DOI

10.1109/ICNNSP.2003.1279213

Filename

1279213