Title :
Learning of gestures by imitation using a monocular vision system on a humanoid robot
Author :
Sabbaghi, Elaheh ; Bahrami, M. ; Ghidary, Saeed Shiry
Author_Institution :
Robot. Res. Inst., Amirkabir Univ. of Technol., Tehran, Iran
Abstract :
In this paper, we present a vision-based imitation learning system that enables a humanoid robot to recognize and learn upper body gestures using only a single camera mounted on the robot's head, without requiring the demonstrator (human or robot) to wear specific colors or patches. The limited information in monocular images and the complexity of human motion make 3D human pose reconstruction challenging. In the motion capture system, a learning-based method recovers the 3D pose of the upper body (shoulder and elbow joint angles) from monocular images by direct nonlinear regression against shape descriptor vectors extracted from image silhouettes. The goal of the proposed imitation learning system is to learn gestures by generalizing over multiple demonstrations. To achieve this, the joint angle trajectories used for training are first projected into a latent space with principal component analysis (PCA) and then temporally aligned using dynamic time warping (DTW). The resulting signals are modeled with a Gaussian mixture model (GMM) and then generalized using Gaussian mixture regression. The system is evaluated qualitatively on the Human3.6M dataset and quantitatively on a sequence of images captured by the robot camera. It is also implemented and tested on the Nao humanoid robot. Experimental results show that the proposed system can effectively perceive, recognize, and learn demonstrated gestures in real scenarios.
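The temporal alignment step mentioned in the abstract can be illustrated with a minimal sketch of textbook dynamic time warping on one-dimensional signals. This is not the authors' implementation: the abstract does not give the cost function or the warping constraints, so the absolute-difference cost and the function names `dtw_path` and `warp_to_reference` are assumptions made for illustration only.

```python
from math import inf


def dtw_path(a, b):
    """Return (total_cost, path) for classic DTW between 1-D sequences a and b.

    Assumption: local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the last cell to recover the warping path.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(steps)
    path.reverse()
    return cost[n][m], path


def warp_to_reference(reference, signal):
    """Resample `signal` onto the time axis of `reference` using the DTW path.

    Samples of `signal` matched to the same reference index are averaged.
    """
    _, path = dtw_path(reference, signal)
    buckets = [[] for _ in reference]
    for ref_idx, sig_idx in path:
        buckets[ref_idx].append(signal[sig_idx])
    return [sum(values) / len(values) for values in buckets]
```

In a pipeline like the one described, each demonstration (for example, one latent-space coordinate over time) could be warped to a chosen reference demonstration in this way, so that all trajectories share a common time axis before a mixture model is fitted.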
Keywords :
Gaussian processes; computer graphics; gesture recognition; humanoid robots; mixture models; motion estimation; pose estimation; regression analysis; robot vision; 3D human pose reconstruction; GMM; Gaussian distributions; Gaussian mixture regression; Nao humanoid robot; PCA; body gestures; complicated nature; forcing demonstrator; human motion; image silhouettes; latent space; learning-based method; monocular images; monocular vision system; motion capture system; nonlinear regression; robot camera; robot head; shape descriptor vectors; single camera; vision-based imitation learning system; Estimation; Joints; Learning systems; Robot sensing systems; Three-dimensional displays; Trajectory; Gaussian mixture model; Gaussian mixture regression; human pose estimation; imitation learning; monocular vision;
Conference_Title :
Robotics and Mechatronics (ICRoM), 2014 Second RSI/ISM International Conference on
Conference_Location :
Tehran
DOI :
10.1109/ICRoM.2014.6990966