DocumentCode
3021025
Title
Structure context of local features in realistic human action recognition
Author
Wu, Qiuxia ; Lu, Shiyang ; Wang, Zhiyong ; Deng, Feiqi ; Kang, Wenxiong ; Feng, David Dagan
fYear
2011
fDate
6-13 Nov. 2011
Firstpage
1496
Lastpage
1501
Abstract
Realistic human action recognition is a challenging research topic because of the difficulty of representing different human actions in diverse realistic scenes. In the bag-of-features model, human actions are generally represented by the distribution of local features derived from the keypoints of action videos. Various local features have been proposed to characterize those keypoints. However, the important structural information among the keypoints has not yet been well investigated. In this paper, we propose to characterize such structural information with shape context, so that each keypoint is characterized by both its local visual attributes and its global structural context contributed by the other keypoints. The bag-of-features model is used to represent each human action, and an SVM is employed to perform human action recognition. Experimental results on the challenging YouTube dataset and the HOHA-2 dataset demonstrate that our proposed approach, which accounts for structural information, is more effective in representing realistic human actions. We also investigate the impact of choosing different local features, such as SIFT, HOG, and HOF descriptors, in human action representation. We observe that dense keypoints can better exploit the advantages of our proposed approach.
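As a rough illustration of the structural descriptor mentioned in the abstract, the sketch below computes a classic 2D log-polar shape context histogram for each keypoint from the positions of all other keypoints. The function name, the bin counts, the radial limits, and the restriction to 2D coordinates are assumptions made for this example; the abstract does not state the paper's actual binning or whether the temporal axis is included.

```python
import numpy as np


def shape_context(points, n_r=5, n_theta=12, r_inner=0.125, r_outer=2.0):
    """Log-polar histogram of the other points' positions, one row per point.

    All parameter defaults are illustrative assumptions, not values from the paper.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    if n < 2:
        return np.zeros((n, n_r * n_theta))

    # diff[i, j] is the vector from point i to point j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])

    # Normalise by the mean pairwise distance for scale invariance.
    off_diag = ~np.eye(n, dtype=bool)
    dist = dist / dist[off_diag].mean()

    # Angle of each offset vector, mapped to [0, 2*pi).
    theta = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    # Log-spaced upper edges of the radial bins; points beyond r_outer are dropped.
    r_edges = np.logspace(np.log10(r_inner), np.log10(r_outer), n_r)
    r_bin = np.searchsorted(r_edges, dist, side="left")
    t_bin = np.minimum((theta / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    hist = np.zeros((n, n_r, n_theta))
    for i in range(n):
        for j in range(n):
            if i != j and r_bin[i, j] < n_r:
                hist[i, r_bin[i, j], t_bin[i, j]] += 1
    return hist.reshape(n, -1)


# Four keypoints at the corners of a unit square (made-up coordinates).
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]])
descriptors = shape_context(corners)
print(descriptors.shape)
```

In a pipeline of the kind the abstract outlines, each row could be paired with the keypoint's local descriptor (for example SIFT, HOG, or HOF) before quantization into a bag-of-features histogram and SVM classification; how the paper actually combines the two is not specified in this record.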
Keywords
feature extraction; image recognition; support vector machines; video signal processing; HOF descriptor; HOG descriptor; SIFT descriptor; SVM; action video; bag-of-features model; histogram-of-gradients; histogram-of-optical flow; human action representation; local feature distribution; local feature structure; realistic human action recognition; scale invariant feature transform; visual attribute; Bars; Context; Humans; Shape; Vectors; Videos; YouTube
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision Workshops (ICCV Workshops), 2011 IEEE International Conference on
Conference_Location
Barcelona
Print_ISBN
978-1-4673-0062-9
Type
conf
DOI
10.1109/ICCVW.2011.6130427
Filename
6130427