Title :
Temporal difference learning with Interpolated N-Tuple networks: initial results on pole balancing
Author :
Abdullahi, Aisha A. ; Lucas, Simon M.
Author_Institution :
Sch. of Comput. Sci. & Electron. Eng., Univ. of Essex, Colchester, UK
Abstract :
Temporal difference learning (TDL) is perhaps the most widely used reinforcement learning method and gives competitive results on a range of problems, especially when using linear or table-based function approximators. However, it has been shown to give poor results on some continuous control problems, and an important question is how it can be applied to such problems more effectively. The crucial point is how TDL can be generalized and scaled to deal with complex, high-dimensional problems without suffering from the curse of dimensionality. We introduce a new function approximation architecture called the Interpolated N-Tuple network and perform a proof-of-concept test on the classic reinforcement learning problem of pole balancing. The results show the method to be highly effective on this problem and offer an important counter-example to some recently reported results that showed neuro-evolution outperforming TDL. TDL with Interpolated N-Tuple networks learns to balance the pole considerably faster than the leading neuro-evolution techniques.
Keywords :
approximation theory; interpolation; learning (artificial intelligence); continuous control problem; function approximation architecture; high dimensional problem; interpolated N-tuple network; pole balancing; proof of concept test; reinforcement learning method; table based function approximator; temporal difference learning; Algorithm design and analysis; Approximation algorithms; Function approximation; Interpolation; Learning; Mathematical model
Conference_Title :
Computational Intelligence (UKCI), 2010 UK Workshop on
Conference_Location :
Colchester
Print_ISBN :
978-1-4244-8774-5
Electronic_ISBN :
978-1-4244-8773-8
DOI :
10.1109/UKCI.2010.5625609