Title :
Reinforcement learning combined with human feedback in continuous state and action spaces
Author :
Ngo Anh Vien ; Ertel, Wolfgang
Author_Institution :
Inst. of Artificial Intell., Ravensburg-Weingarten Univ. of Appl. Sci., Weingarten, Germany
Abstract :
We consider the problem of extending manually trained agents via evaluative reinforcement (TAMER) to continuous state and action spaces. The earlier TAMER framework allows a non-technical human to train an agent through a natural form of human feedback, either negative or positive. The advantages of TAMER have been shown in applications such as training agents on Tetris and Mountain Car with human feedback only, and on Cart-pole and Mountain Car with human feedback combined with environment reward (augmenting reinforcement learning with human feedback). However, those methods were originally designed for discrete state-action problems or continuous state-discrete action problems. We propose an extension of TAMER, called ACTAMER, that allows both continuous states and continuous actions. The new framework extends the original TAMER to allow any general function approximation of a human trainer's reinforcement signal. Moreover, we investigate how ACTAMER can be combined with reinforcement learning (RL). The combination of human feedback and RL is studied in two settings: sequential and simultaneous. Our experimental results show that the proposed method successfully allows a human to train an agent in two continuous state-action domains: Mountain Car and Cart-pole (balancing).
Keywords :
learning (artificial intelligence); multi-agent systems; ACTAMER; Cart-pole; Mountain Car training; RL; TAMER framework; Tetris training; action spaces; augmenting reinforcement learning; continuous state-action domains; continuous state-discrete action problems; continuous states; discrete state-action; environment reward; evaluative reinforcement; human feedback; manually trained agents; reinforcement signal; Approximation algorithms; Function approximation; Humans; Learning; Training; Vectors;
Conference_Titel :
Development and Learning and Epigenetic Robotics (ICDL), 2012 IEEE International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
978-1-4673-4964-2
Electronic_ISBN :
978-1-4673-4963-5
DOI :
10.1109/DevLrn.2012.6400849