DocumentCode :
178104
Title :
Modeling the Relationship of Action, Object, and Scene
Author :
Jing Liu ; Xinxiao Wu ; Yang Feng
Author_Institution :
Beijing Lab. of Intell. Inf. Technol., Beijing Inst. of Technol., Beijing, China
fYear :
2014
fDate :
24-28 Aug. 2014
Firstpage :
2005
Lastpage :
2010
Abstract :
In action recognition, objects and scenes can provide a rich source of contextual information for analyzing human actions, since actions often occur in particular scene settings with certain related objects. We therefore exploit contextual objects and scenes to improve action recognition performance. Specifically, a latent structural SVM is introduced to model the co-occurrence relationships among action, object, and scene, in which the object class label and scene class label are treated as latent variables. Using this framework, we can simultaneously predict action, object, and scene class labels. Moreover, we use a mid-level discriminative feature to describe the action, object, and scene information separately. The feature is a set of decision values from pre-learned classifiers, one per class, each measuring the likelihood that the input video belongs to the corresponding class. In this paper, we use SVMs as the pre-learned action and scene classifiers and a deformable part-based object detector as the pre-learned object classifier, so that object locations are obtained as a by-product. Experimental results on the UCF Sports, YouTube, and UCF50 datasets demonstrate the effectiveness of the proposed approach.
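The joint prediction described in the abstract can be illustrated with a small sketch. The abstract does not give the exact scoring function, so the form used here (per-class decision values plus pairwise co-occurrence weights, maximized by exhaustive search) is an assumption, and the names `joint_inference`, `w_ao`, `w_as`, and `w_os` are hypothetical. The weights are set by hand for illustration rather than learned by a latent structural SVM.

```python
import itertools


def joint_inference(action_scores, object_scores, scene_scores, w_ao, w_as, w_os):
    """Pick the (action, object, scene) triple with the highest joint score.

    *_scores: per-class decision values from pre-learned classifiers
              (the mid-level feature described in the abstract).
    w_ao, w_as, w_os: hypothetical pairwise co-occurrence weights,
              indexed as w_ao[action][object], w_as[action][scene],
              w_os[object][scene].
    """
    best_labels, best_score = None, float("-inf")
    label_space = itertools.product(
        range(len(action_scores)),
        range(len(object_scores)),
        range(len(scene_scores)),
    )
    for a, o, s in label_space:
        # Unary terms: how strongly each classifier supports its label.
        unary = action_scores[a] + object_scores[o] + scene_scores[s]
        # Pairwise terms: how compatible the three labels are with each other.
        pairwise = w_ao[a][o] + w_as[a][s] + w_os[o][s]
        score = unary + pairwise
        if score > best_score:
            best_labels, best_score = (a, o, s), score
    return best_labels, best_score


# Toy example: 2 action classes, 2 object classes, 2 scene classes.
action_scores = [0.2, 0.1]   # the action classifier alone prefers action 0
object_scores = [0.0, 0.3]
scene_scores = [0.1, 0.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
w_ao = [[0.0, 0.0], [0.0, 1.0]]  # action 1 co-occurs strongly with object 1

labels, score = joint_inference(
    action_scores, object_scores, scene_scores, w_ao, zeros, zeros
)
print(labels, score)
```

In this toy setting the action classifier alone would pick action 0, but the co-occurrence weight between action 1 and object 1 moves the joint prediction to action 1, which is the kind of contextual correction the abstract motivates. Exhaustive search is only practical for small label sets.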
Keywords :
image classification; image motion analysis; image recognition; object detection; support vector machines; video signal processing; UCF Sports; UCF50 datasets; YouTube; action pre-learned classifiers; action recognition; deformable part-based object detector; latent structural SVM; object pre-learned classifier; scene pre-learned classifiers; Accuracy; Context; Context modeling; Correlation; Feature extraction; Training; LSSVM; scene recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2014 22nd International Conference on
Conference_Location :
Stockholm
ISSN :
1051-4651
Type :
conf
DOI :
10.1109/ICPR.2014.350
Filename :
6977062