DocumentCode :
724497
Title :
One-shot learning gesture recognition based on improved 3D SMoSIFT feature descriptor from RGB-D videos
Author :
Jia Lin ; Xiaogang Ruan ; Naigong Yu ; Ruoyan Wei
Author_Institution :
Electron. Inf. & Control Eng. Coll., Beijing Univ. of Technol., Beijing, China
fYear :
2015
fDate :
23-25 May 2015
Firstpage :
4911
Lastpage :
4916
Abstract :
To satisfy the distinctive feature extraction requirement of one-shot learning gesture recognition for mobile robot control, an improved three-dimensional local sparse motion scale invariant feature transform (3D SMoSIFT) feature descriptor is proposed, which fuses RGB-D videos. First, a gray pyramid, a depth pyramid, and optical flow pyramids are built as the scale space for each gray frame (converted from the RGB frame) and each depth frame. Then interest regions are extracted according to the variance of the optical flow, with the variance calculated in the horizontal and vertical directions. Subsequently, corners are extracted only within each interest region as interest points, and the gray and depth optical flow information is used simultaneously to detect robust keypoints around the motion pattern in the scale space. Finally, SIFT descriptors are calculated on the 3D gradient space and the 3D motion space. The improved feature descriptor has been evaluated under a bag-of-features model on the one-shot learning Chalearn Gesture Dataset. Experiments demonstrate that the proposed method distinctly improves the accuracy of gesture recognition. The results also show that the improved 3D SMoSIFT feature descriptor surpasses other spatiotemporal feature descriptors and is comparable to the state-of-the-art approaches.
Keywords :
feature extraction; gesture recognition; image colour analysis; image fusion; image motion analysis; image sequences; learning (artificial intelligence); mobile robots; robot vision; video signal processing; 3D SMoSIFT feature descriptor; 3D gradient space; 3D motion space; RGB-D video fusion; RGB-D videos; SIFT descriptors; corner extraction; depth pyramid; feature extraction; gray pyramid; interest region extraction; mobile robot control; motion pattern; one-shot learning Chalearn gesture dataset; one-shot learning gesture recognition; optical flow pyramid; robust keypoint detection; three-dimensional local sparse motion scale invariant feature transform; Conferences; Gesture Recognition; One-shot Learning; RGB-D Data; Three dimensional Sparse Motion Scale-invariant Feature Transform (3D SMoSIFT); Variance of Optical Flow;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Control and Decision Conference (CCDC), 2015 27th Chinese
Conference_Location :
Qingdao
Print_ISBN :
978-1-4799-7016-2
Type :
conf
DOI :
10.1109/CCDC.2015.7162803
Filename :
7162803