Title :
Reward-penalty reinforcement learning scheme for planning and reactive behaviour
Author :
Araújo, Aluizio F R ; Braga, Arthur P S
Author_Institution :
Dept. of Electr. Eng., Sao Paulo Univ., Brazil
Abstract :
This paper describes a reinforcement learning algorithm that allows a point robot to learn navigation strategies within initially unknown indoor environments with fixed and dynamic obstacles. The knowledge is encoded in two surfaces, called the reward and penalty surfaces, which are updated when a target is found and whenever the robot moves, respectively. The proposed policy is suitable for both planning and reactive behaviour. The tests involve different kinds of obstacles: a fixed passage, a barrier, a U-shaped obstacle and a simple maze. The results suggest that the model solves the goal-directed exploration problem. Thus, the robot is able to reach a desired goal, starting its movement from any position within the environment, avoiding obstacles, and following a viable trajectory. The robot may get stuck in dynamic obstacles, may depend on randomness to avoid them, and generally does not solve the goal-directed reinforcement learning problem.
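The abstract gives only the outline of the scheme: a penalty surface updated on every move and a reward surface updated when the target is found. The sketch below is one possible reading of that outline on a small grid world, not the authors' algorithm. The grid representation, the greedy "reward minus penalty" move rule, the function name `run_episode`, and the parameters `step_penalty` and `decay` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 4-connected moves for a point robot: up, down, left, right.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def run_episode(grid, start, goal, reward, penalty,
                max_steps=500, step_penalty=0.1, decay=0.9):
    """Run one trial on `grid` (1 = obstacle, 0 = free).

    Both surfaces are float arrays shaped like `grid` and are updated
    in place. Returns the visited path, or None if the goal is not
    reached within `max_steps`.
    """
    pos, path = start, [start]
    for _ in range(max_steps):
        if pos == goal:
            # Reward surface: updated only when the target is found.
            # Assumed rule: reward decays with distance back along the path.
            for k, cell in enumerate(reversed(path)):
                reward[cell] = max(reward[cell], decay ** k)
            return path
        # Penalty surface: updated on every move (assumed visit cost).
        penalty[pos] += step_penalty
        best, best_val = None, -np.inf
        for dr, dc in MOVES:
            nxt = (pos[0] + dr, pos[1] + dc)
            if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                continue
            if grid[nxt] == 1:
                # A sensed obstacle is recorded as an infinite penalty.
                penalty[nxt] = np.inf
            val = reward[nxt] - penalty[nxt]
            if val > best_val:  # strict '>' keeps the first best move
                best, best_val = nxt, val
        if best is None:
            return None  # every neighbour is blocked
        pos = best
        path.append(pos)
    return None


if __name__ == "__main__":
    # A barrier in the middle column; the robot must go around it.
    grid = np.array([[0, 1, 0],
                     [0, 1, 0],
                     [0, 0, 0]])
    reward = np.zeros(grid.shape)
    penalty = np.zeros(grid.shape)
    print(run_episode(grid, (0, 0), (0, 2), reward, penalty))
```

In this reading, the penalty surface drives reactive behaviour (it pushes the robot away from obstacles and already-visited cells), while the reward surface, once a target has been found, supplies a gradient that later trials can follow as a plan. The sketch handles only fixed obstacles and has no random component.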
Keywords :
collision avoidance; learning (artificial intelligence); mobile robots; path planning; problem solving; goal-directed exploration problem; mobile robot; navigation strategies; obstacle avoidance; path planning; penalty surfaces; point robot; reactive behaviour; reward surfaces; reward-penalty reinforcement learning; unknown indoor environments; Cognitive robotics; Indoor environments; Learning; Navigation; Path planning; Robot control; Robot sensing systems; Shape; State estimation; Testing
Conference_Title :
Systems, Man, and Cybernetics, 1998. 1998 IEEE International Conference on
Conference_Location :
San Diego, CA
Print_ISBN :
0-7803-4778-1
DOI :
10.1109/ICSMC.1998.728095