Title :
Local descriptions for human action recognition from 3D reconstruction data
Author :
Papadopoulos, Georgios T. ; Daras, Petros
Author_Institution :
Information Technologies Institute, Centre for Research and Technology Hellas (CERTH), Greece
Abstract :
In this paper, a view-invariant approach to human action recognition using 3D reconstruction data is proposed. Initially, a set of calibrated Kinect sensors is employed to produce a 3D reconstruction of the performing subjects. Subsequently, a 3D flow field is estimated for every captured frame. Action recognition follows the 'Bag-of-Words' methodology, where Spatio-Temporal Interest Points (STIPs) are detected in the 4D space (xyz-coordinates plus time). A novel local-level 3D flow descriptor is introduced, which, among other features, incorporates spatial and surface information into the flow representation and efficiently handles the problem of defining a 3D orientation at every STIP location. Additionally, typical 3D shape descriptors from the literature are used to produce a more complete representation. Experimental results and a comparative evaluation on datasets from the Huawei/3DLife 3D human reconstruction and action recognition Grand Challenge demonstrate the efficiency of the proposed approach.
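For illustration, below is a minimal sketch of the Bag-of-Words stage outlined in the abstract, assuming per-sequence STIP descriptors are already available as arrays. The proposed local-level 3D flow and shape descriptors are the paper's contribution and are not reproduced here; the vocabulary size, k-means codebook, and SVM classifier are illustrative assumptions rather than the authors' exact configuration (Python, using NumPy and scikit-learn).

# Sketch of a Bag-of-Words action recognition pipeline: local descriptors
# extracted at STIPs are quantized against a learned codebook, and the
# resulting per-sequence histograms are classified. Descriptor extraction
# itself is assumed to be done elsewhere.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_codebook(train_descriptors, vocab_size=256, seed=0):
    """Cluster all training STIP descriptors into a visual vocabulary."""
    stacked = np.vstack(train_descriptors)  # shape: (total_STIPs, descriptor_dim)
    return KMeans(n_clusters=vocab_size, random_state=seed).fit(stacked)

def bow_histogram(descriptors, codebook):
    """Quantize one sequence's STIP descriptors into a normalized BoW histogram."""
    words = codebook.predict(descriptors)
    hist = np.bincount(words, minlength=codebook.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)

# Usage sketch (train_descr / test_descr are lists of (n_i, d) descriptor
# arrays per sequence, y_train the corresponding action labels):
# codebook = build_codebook(train_descr)
# X_train = np.array([bow_histogram(d, codebook) for d in train_descr])
# clf = SVC(kernel="rbf").fit(X_train, y_train)
# y_pred = clf.predict(np.array([bow_histogram(d, codebook) for d in test_descr]))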
Keywords :
gesture recognition; image reconstruction; image representation; image sensors; 3D flow field; 3D orientation; 3D reconstruction data; 3D shape descriptors; 4D space; STIP; bag-of-words methodology; calibrated Kinect sensors; flow representation; human action recognition; local descriptions; local-level 3D flow descriptor; spatial information; spatio-temporal interest points; surface information; view-invariant approach; Computer vision; Estimation; Noise; Shape; Three-dimensional displays; Vectors; Videos; 3D flow; 3D reconstruction; Action recognition; Kinect; view-invariance;
Conference_Title :
2014 IEEE International Conference on Image Processing (ICIP)
Conference_Location :
Paris, France
DOI :
10.1109/ICIP.2014.7025569