Body Surface Context: A New Robust Feature for Action Recognition From Depth Videos

Author

Yan Song ; Jinhui Tang ; Fan Liu ; Shuicheng Yan

Author_Institution

Sch. of Comput. Sci. & Technol., Nanjing Univ. of Sci. & Technol., Nanjing, China

Volume

24

Issue

6

fYear

2014

fDate

Jun-14

Firstpage

952

Lastpage

964

Abstract

Human action recognition in videos is useful for many applications. However, huge challenges remain in real applications because of variations in the subjects' appearance, lighting conditions, and viewing angle. In this respect, depth data have an advantage over red, green, blue (RGB) data because they carry spatial information about the distance between the object and the viewpoint. Unlike existing works, we utilize the 3-D point cloud, which contains points in the 3-D real-world coordinate system, to represent the external surface of the human body. Specifically, we propose a new robust feature, the body surface context (BSC), which describes the distribution of the relative locations of a reference point's neighbors in the point cloud in a compact and descriptive way. The BSC encodes the cylindrical angle of the difference vector based on the characteristics of the human body, which increases the descriptiveness and discriminability of the feature. Because the BSC is an approximately object-centered feature, it is robust to transformations such as translations and rotations, which are very common in real applications. Furthermore, we propose three schemes for representing human actions based on the new feature: the skeleton-based scheme, the random-reference-point scheme, and the spatial-temporal scheme. In addition, to evaluate the proposed feature, we construct a human action dataset using a depth camera. Experiments on three datasets demonstrate that the proposed feature outperforms RGB-based features and other existing depth-based features, which indicates that the BSC feature is promising for human action recognition.
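
The idea of describing a reference point by the distribution of its neighbors' relative locations can be illustrated with a minimal sketch. It is not the authors' BSC formulation: the function name `cylindrical_histogram`, the bin counts, the neighborhood radius, and the choice of the y axis as the vertical body axis are all assumptions made for illustration. Each difference vector from the reference point to a neighbor is expressed in cylindrical coordinates (radial distance, azimuth angle, height) and counted in a normalized histogram.

```python
import math

def cylindrical_histogram(points, ref, radius=0.5,
                          n_radial=3, n_angular=8, n_height=4):
    """Normalized histogram of neighbor offsets around `ref`.

    Illustrative only: offsets are binned in cylindrical coordinates
    about an assumed vertical (y) axis through the reference point.
    """
    hist = [0] * (n_radial * n_angular * n_height)
    rx, ry, rz = ref
    for x, y, z in points:
        dx, dy, dz = x - rx, y - ry, z - rz       # difference vector
        rho = math.hypot(dx, dz)                  # distance from the vertical axis
        if dx == 0 and dy == 0 and dz == 0:
            continue                              # skip the reference point itself
        if rho >= radius or abs(dy) >= radius:
            continue                              # outside the cylindrical neighborhood
        theta = math.atan2(dz, dx) % (2 * math.pi)  # azimuth in [0, 2*pi]
        r_bin = min(int(rho / radius * n_radial), n_radial - 1)
        a_bin = min(int(theta / (2 * math.pi) * n_angular), n_angular - 1)
        h_bin = min(int((dy + radius) / (2 * radius) * n_height), n_height - 1)
        hist[(r_bin * n_angular + a_bin) * n_height + h_bin] += 1
    total = sum(hist)
    if total == 0:
        return [0.0] * len(hist)
    return [count / total for count in hist]
```

Because only difference vectors enter the histogram, translating the whole cloud together with the reference point leaves the result unchanged. This sketch makes no attempt at rotation robustness; the azimuth bins are fixed to the camera axes, which is a simplification.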

Keywords

image motion analysis; image representation; object recognition; video signal processing; 3D point cloud; 3D real-world coordinate system; BSC; body surface context; depth camera; depth data; depth videos; difference vector; human action dataset; human action recognition; random reference point scheme; skeleton based scheme; spatial temporal scheme; Context; Feature extraction; Joints; Shape; Three-dimensional displays; Vectors; depth video; feature; point cloud

fLanguage

English

Journal_Title

Circuits and Systems for Video Technology, IEEE Transactions on

Publisher

IEEE

ISSN

1051-8215

Type

jour

DOI

10.1109/TCSVT.2014.2302558

Filename

6722961