Regional Information Center for Science and Technology (RICeST) - One-shot learning gesture recognition based on improved 3D SMoSIFT feature descriptor from RGB-D videos

DocumentCode :

724497

Title :

One-shot learning gesture recognition based on improved 3D SMoSIFT feature descriptor from RGB-D videos

Author :

Jia Lin ; Xiaogang Ruan ; Naigong Yu ; Ruoyan Wei

Author_Institution :

Electron. Inf. & Control Eng. Coll., Beijing Univ. of Technol., Beijing, China

fYear :

2015

fDate :

23-25 May 2015

Firstpage :

4911

Lastpage :

4916

Abstract :

To satisfy the distinctive feature extraction requirement of one-shot learning gesture recognition for mobile robot control, an improved three-dimensional local sparse motion scale-invariant feature transform (3D SMoSIFT) feature descriptor is proposed that fuses RGB-D video data. First, gray, depth, and optical flow pyramids are built as the scale space for each gray frame (converted from the RGB frame) and depth frame. Interest regions are then extracted according to the variance of the optical flow, which is calculated in the horizontal and vertical directions. Next, corners are extracted only within each interest region as interest points, and gray and depth optical flow information is used simultaneously to detect robust keypoints around the motion pattern in the scale space. Finally, SIFT descriptors are calculated in the 3D gradient space and the 3D motion space. The improved feature descriptor is evaluated with a bag-of-features model on the one-shot learning ChaLearn Gesture Dataset. Experiments demonstrate that the proposed method distinctly improves gesture recognition accuracy. The results also show that the improved 3D SMoSIFT feature descriptor surpasses other spatiotemporal feature descriptors and is comparable to state-of-the-art approaches.
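The interest-region step in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the selection rule, so the function name `interest_region`, the `ratio` threshold, and the choice to bound the region by rows and columns whose variance reaches a fraction of the maximum are all assumptions made for illustration. The input is assumed to be a precomputed grid of optical flow magnitudes for one frame.

```python
from statistics import pvariance


def interest_region(flow_mag, ratio=0.5):
    """Bound the moving area of one frame from optical-flow variance.

    flow_mag: list of rows, each a list of optical-flow magnitudes
              (assumed to be precomputed elsewhere).
    ratio:    hypothetical threshold, as a fraction of the largest variance.
    Returns (top, bottom, left, right) as inclusive indices, or None when
    the frame shows no variation in flow at all.
    """
    # Variance of the flow along each row and along each column.
    row_var = [pvariance(row) for row in flow_mag]
    col_var = [pvariance(col) for col in zip(*flow_mag)]
    if max(row_var) == 0 or max(col_var) == 0:
        return None
    # Keep rows and columns whose variance is close to the maximum (assumed rule).
    rows = [i for i, v in enumerate(row_var) if v >= ratio * max(row_var)]
    cols = [j for j, v in enumerate(col_var) if v >= ratio * max(col_var)]
    return rows[0], rows[-1], cols[0], cols[-1]


# Toy 6x6 frame: motion only in rows 2-3, columns 1-3.
frame = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (1, 2, 3):
        frame[r][c] = 4.0
region = interest_region(frame)  # (2, 3, 1, 3)
```

In a fuller pipeline, corner detection would then run only inside the returned bounds, which is what restricts the keypoints to the moving parts of the scene.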

Keywords :

feature extraction; gesture recognition; image colour analysis; image fusion; image motion analysis; image sequences; learning (artificial intelligence); mobile robots; robot vision; video signal processing; 3D SMoSIFT feature descriptor; 3D gradient space; 3D motion space; RGB-D video fusion; RGB-D videos; SIFT descriptors; corner extraction; depth pyramid; gray pyramid; interest region extraction; mobile robot control; motion pattern; one-shot learning Chalearn gesture dataset; one-shot learning gesture recognition; optical flow pyramid; robust keypoint detection; three-dimensional local sparse motion scale invariant feature transform; Conferences; Gesture Recognition; One-shot Learning; RGB-D Data; Three-dimensional Sparse Motion Scale-invariant Feature Transform (3D SMoSIFT); Variance of Optical Flow

fLanguage :

English

Publisher :

IEEE

Conference_Title :

2015 27th Chinese Control and Decision Conference (CCDC)

Conference_Location :

Qingdao

Print_ISBN :

978-1-4799-7016-2

Type :

conf

DOI :

10.1109/CCDC.2015.7162803

Filename :

7162803

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=724497