Title :
A Grey Synthesis Approach to Efficient Architecture Design for Temporal Difference Learning
Author :
Hwang, Kao-Shing ; Lo, Chia-Yue ; Lee, Guan-Yuan
Author_Institution :
Nat. Chung-Cheng Univ., Chiayi, Taiwan
Abstract :
Temporal difference (TD) methods constitute a class of methods for learning predictions in multistep prediction problems. Their most important application is temporal credit assignment in reinforcement learning. Although TD procedures work in theory and in principle, their success depends on a proper selection of parameter values. In addition, learning relies largely on repeated exposures, which may not always be practical or feasible. This paper examines efficient and general hardware implementation of TD for reinforcement learning algorithms by synthesizing the series of discounted sums of rewards over time. The proposed algorithm eliminates all step-size parameters and improves data efficiency by means of a synthetic approach based on Grey theory. This paper also analyzes the stability of the proposed algorithm from the viewpoint of Grey theory. The algorithm, along with a critic-actor reinforcement learning model, is implemented on a System-on-a-Programmable-Chip (SOPC) board. Experiments, including a comparison with the well-known adaptive heuristic critic (AHC) model, demonstrate that the proposed control mechanism can learn to control a system with very little a priori knowledge. Moreover, the effect of uncertainty in the interactions between the system and the environment can be mitigated to some extent during the learning process of the proposed reinforcement learning agent.
Keywords :
grey systems; learning (artificial intelligence); programmable circuits; system-on-chip; temporal reasoning; architecture design; critic-actor reinforcement learning; grey synthesis approach; intelligent control; multistep prediction problems; system on a programmable chip; temporal credit assignment; temporal difference learning; Adaptive systems; Algorithm design and analysis; Artificial neural networks; Learning; Mathematical model; Predictive models; Grey theory; reinforcement learning; temporal difference
Journal_Title :
Mechatronics, IEEE/ASME Transactions on
DOI :
10.1109/TMECH.2010.2082558