DocumentCode
406112
Title
Credit of optimal state transition based reinforcement learning algorithm
Author
Bai, Tingfeng ; Wu, Gengfeng
Author_Institution
Sch. of Comput. Sci. & Eng., Shanghai Univ., China
Volume
1
fYear
2003
fDate
14-17 Dec. 2003
Firstpage
62
Abstract
This paper proposes an optimal model based on the distance between the current state and the goal state and on the cost of state transitions, in order to solve goal-state problems more effectively. Based on this model, a reinforcement learning algorithm named COSTRLA (credit of optimal state transition based reinforcement learning algorithm) is presented. COSTRLA defines a COST function to evaluate the optimality of the output strategy and derives an update rule for the COST function from the dynamic programming principle; the reinforcement signal is defined as the distance from the current state to the goal state. COSTRLA was applied to the cooperative control of the Buddy-Arnolds robot. Simulation experiments show the advantages of COSTRLA over popular reinforcement learning algorithms such as Q-learning and prioritized sweeping.
Keywords
cooperative systems; dynamic programming; learning (artificial intelligence); multi-robot systems; optimal systems; optimal model; optimal state transition; prioritized sweeping algorithms; reinforcement learning algorithm; Computer science; Cost function; Learning; Predictive models; Robot control
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks and Signal Processing, 2003. Proceedings of the 2003 International Conference on
Conference_Location
Nanjing
Print_ISBN
0-7803-7702-8
Type
conf
DOI
10.1109/ICNNSP.2003.1279213
Filename
1279213