Structure context of local features in realistic human action recognition

Author

Wu, Qiuxia ; Lu, Shiyang ; Wang, Zhiyong ; Deng, Feiqi ; Kang, Wenxiong ; Feng, David Dagan

fYear

2011

fDate

6-13 Nov. 2011

Firstpage

1496

Lastpage

1501

Abstract

Realistic human action recognition has emerged as a challenging research topic because of the difficulty of representing different human actions in diverse realistic scenes. In the bag-of-features model, human actions are generally represented by the distribution of local features derived from the keypoints of action videos. Various local features have been proposed to characterize those keypoints. However, the important structural information among the keypoints has not yet been well investigated. In this paper, we propose to characterize such structural information with shape context. Each keypoint is thus characterized by both its local visual attributes and its global structural context contributed by other keypoints. The bag-of-features model is used to represent each human action, and an SVM is employed to perform human action recognition. Experimental results on the challenging YouTube dataset and HOHA-2 dataset demonstrate that our proposed approach, which accounts for structural information, is more effective in representing realistic human actions. We also investigate the impact of choosing different local features, such as SIFT, HOG, and HOF descriptors, in human action representation. It is observed that dense keypoints can better exploit the advantages of our proposed approach.
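
The structural context described in the abstract can be illustrated with a generic shape context descriptor: for each keypoint, a log-polar histogram counts where the other keypoints fall relative to it. The following is a minimal sketch, not the authors' implementation. The function name `shape_context`, the use of 2D keypoint positions, the bin counts, the radial range, and the normalisation by mean pairwise distance are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np

def shape_context(points, n_radial=5, n_angular=12, r_min=0.125, r_max=2.0):
    """Log-polar histogram of the other keypoints around each keypoint.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns an array of shape (n_points, n_radial * n_angular).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)

    # diff[i, j] is the vector from keypoint i to keypoint j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    angle = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Divide by the mean pairwise distance so the descriptor is scale invariant.
    off_diag = ~np.eye(n, dtype=bool)
    dist = dist / dist[off_diag].mean()

    # Log-spaced radial bin edges and uniform angular bins.
    r_edges = np.logspace(np.log10(r_min), np.log10(r_max), n_radial + 1)
    r_bin = np.searchsorted(r_edges, dist, side="right") - 1
    a_bin = np.minimum((angle / (2 * np.pi) * n_angular).astype(int),
                       n_angular - 1)

    # Count every other keypoint that falls inside the radial range.
    hist = np.zeros((n, n_radial, n_angular))
    valid = off_diag & (r_bin >= 0) & (r_bin < n_radial)
    i_idx, _ = np.nonzero(valid)
    np.add.at(hist, (i_idx, r_bin[valid], a_bin[valid]), 1)
    return hist.reshape(n, -1)

# Four keypoints on the corners of a unit square (made-up example data).
descriptors = shape_context([[0, 0], [1, 0], [0, 1], [1, 1]])
print(descriptors.shape)
```

One plausible way to obtain a keypoint representation with both local and structural information would be to concatenate each row of `descriptors` with that keypoint's local descriptor (for example SIFT, HOG, or HOF) before building the bag-of-features vocabulary; how the two parts are actually combined in the paper is not stated in the abstract.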

Keywords

feature extraction; image recognition; support vector machines; video signal processing; HOF descriptor; HOG descriptor; SIFT descriptor; SVM; action video; bag-of-features model; histogram-of-gradients; histogram-of-optical flow; human action representation; local feature distribution; local feature structure; realistic human action recognition; scale invariant feature transform; visual attribute; Bars; Context; Humans; Shape; Vectors; Videos; YouTube

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on

Conference_Location

Barcelona

Print_ISBN

978-1-4673-0062-9

Type

conf

DOI

10.1109/ICCVW.2011.6130427

Filename

6130427