Title :
Ensuring safety of policies learned by reinforcement: Reaching objects in the presence of obstacles with the iCub
Author :
Pathak, Sant ; Pulina, Luca ; Metta, G. ; Tacchella, Armando
Author_Institution :
Robotics, Brain and Cognitive Sciences (RBCS), Istituto Italiano di Tecnologia (IIT), Genoa, Italy
Abstract :
Given a stochastic policy learned by reinforcement, we wish to ensure that it can be deployed on a robot with demonstrably low probability of unsafe behavior. Our case study is about learning to reach target objects positioned close to obstacles, and ensuring a reasonably low collision probability. Learning is carried out in a simulator to avoid physical damage in the trial-and-error phase. Once a policy is learned, we analyze it with probabilistic model checking tools to identify and correct potentially unsafe behaviors. The whole process is automated and, in principle, it can be integrated step-by-step with routine task-learning. As our results demonstrate, automated fixing of policies is both feasible and highly effective in bounding the probability of unsafe behaviors.
Keywords :
collision avoidance; humanoid robots; learning (artificial intelligence); probability; stochastic processes; collision probability; iCub; obstacle avoidance; probabilistic model checking tool; reinforcement learning; routine task-learning; stochastic policy; Maintenance engineering; Markov processes; Model checking; Probabilistic logic; Robots; Safety
Conference_Title :
Intelligent Robots and Systems (IROS), 2013 IEEE/RSJ International Conference on
Conference_Location :
Tokyo
DOI :
10.1109/IROS.2013.6696349