Conference record number:
3862
Article title:
Learning of a task despite credit assignment problem using deep representation learning with less trials
Authors:
Davari Dolatabadi, MohammadJavad (University of Tehran); Alipour, Khalil (University of Tehran, k.alipour@ut.ac.ir); Hadi, Alireza (University of Tehran)
Keywords:
Deep Learning, Push Recovery, Credit Assignment Problem, Latent Variable, Rewarding System, Inverse Reinforcement Learning
Conference title:
25th Annual International Conference on Mechanical Engineering
Abstract:
In this paper, we present three new methods to accelerate the learning of a task with the deterministic policy gradient algorithm. We focus specifically on tasks that suffer from the Credit Assignment (CA) problem. A Reinforcement Learning (RL) agent performs the task in a high-dimensional state space. The main idea of this paper is to use the latent variables provided by deep autoencoders to build a better reward system. We show that these new rewards substantially help the agent learn the task under similar circumstances. The task chosen for the algorithm is Push Recovery (PR) in a simulated environment.
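
The abstract does not say how the latent variables enter the reward, so the following is only a minimal sketch of one plausible reading: a dense reward computed as the negative distance, in latent space, between the current state and a desired state. The names `latent_distance_reward` and `toy_encoder`, the fixed linear projection, and the 4-dimensional toy states are all assumptions made for illustration. They are not the authors' three methods, and a real setup would use the encoder half of a trained deep autoencoder.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def latent_distance_reward(
    encode: Callable[[Vector], List[float]],
    state: Vector,
    goal_state: Vector,
    scale: float = 1.0,
) -> float:
    """Dense reward: negative Euclidean distance between latent codes.

    Assumption (not from the paper): states that are closer to the goal
    in latent space should receive a higher reward.
    """
    z = encode(state)
    z_goal = encode(goal_state)
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(z, z_goal)))
    return -scale * dist


def toy_encoder(state: Vector) -> List[float]:
    """Hypothetical stand-in for a trained autoencoder's encoder.

    A fixed linear projection from 4 dimensions to 2, used only so the
    example runs without a deep learning library.
    """
    weights = [
        [0.5, 0.0, 0.5, 0.0],
        [0.0, 0.5, 0.0, 0.5],
    ]
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


# Toy 4-D states standing in for a high-dimensional robot state.
upright = [0.0, 0.0, 0.0, 0.0]
slightly_tilted = [0.1, 0.0, 0.1, 0.0]
badly_tilted = [0.8, 0.2, 0.8, 0.2]

r_near = latent_distance_reward(toy_encoder, slightly_tilted, upright)
r_far = latent_distance_reward(toy_encoder, badly_tilted, upright)
```

Under these assumptions the agent receives a signal at every step rather than only at the end of an episode, which is one common way to ease credit assignment. Whether the paper's rewards take this form is not stated in the abstract.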