Regional Information Center for Science and Technology - Modeling the Relationship of Action, Object, and Scene

DocumentCode :

178104

Title :

Modeling the Relationship of Action, Object, and Scene

Author :

Jing Liu ; Xinxiao Wu ; Yang Feng

Author_Institution :

Beijing Lab. of Intell. Inf. Technol., Beijing Inst. of Technol., Beijing, China

fYear :

2014

fDate :

24-28 Aug. 2014

Firstpage :

2005

Lastpage :

2010

Abstract :

In action recognition, objects and scenes provide a rich source of contextual information for analyzing human actions, because actions often occur in particular scene settings with certain related objects. We therefore use contextual object and scene information to improve action recognition performance. Specifically, a latent structural SVM is introduced to model the co-occurrence relationship among action, object, and scene, in which the object class label and scene class label are treated as latent variables. With this framework, we can simultaneously predict action, object, and scene class labels. Moreover, we use a mid-level discriminative feature to describe the information of action, object, and scene separately. The feature is a set of decision values from the pre-learned classifiers of each class, measuring the likelihood that the input video belongs to the corresponding class. In this paper, we use SVMs as the pre-learned action and scene classifiers and a deformable part-based object detector as the pre-learned object classifier, so that object location is obtained as a by-product. Experimental results on the UCF Sports, YouTube, and UCF50 datasets demonstrate the effectiveness of the proposed approach.
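The joint prediction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring function, so everything here is an assumption made for illustration: the function name `joint_inference`, the additive form of the score (unary terms built from classifier decision values plus action-object and action-scene co-occurrence weights), and all numeric values. Inference is done by exhaustive search over label triples, which is feasible only for small label sets.

```python
import numpy as np


def joint_inference(action_scores, object_scores, scene_scores, W_ao, W_as):
    """Pick the (action, object, scene) triple with the highest joint score.

    Assumed score (hypothetical, not taken from the paper):
        s(a, o, c) = action_scores[a] + object_scores[o] + scene_scores[c]
                     + W_ao[a, o] + W_as[a, c]
    The *_scores vectors play the role of the mid-level feature, i.e. the
    decision values of pre-learned per-class classifiers.
    """
    # Broadcast every term to a tensor of shape (A, O, S).
    total = (action_scores[:, None, None]
             + object_scores[None, :, None]
             + scene_scores[None, None, :]
             + W_ao[:, :, None]
             + W_as[:, None, :])
    # Exhaustive argmax over all label triples.
    a, o, c = np.unravel_index(np.argmax(total), total.shape)
    return int(a), int(o), int(c), float(total[a, o, c])


# Toy example: 2 actions, 3 objects, 2 scenes (all values are made up).
action_scores = np.array([0.2, 0.1])
object_scores = np.array([0.0, 0.5, 0.1])
scene_scores = np.array([0.3, 0.0])
W_ao = np.zeros((2, 3))
W_ao[1, 0] = 2.0          # action 1 co-occurs strongly with object 0
W_as = np.zeros((2, 2))

result = joint_inference(action_scores, object_scores, scene_scores, W_ao, W_as)
print(result)
```

In this toy setting the co-occurrence weight overrides the weaker unary evidence: action 1 wins even though action 0 has the higher standalone classifier score, which is the kind of contextual effect the abstract attributes to modeling object and scene labels jointly with the action.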

Keywords :

image classification; image motion analysis; image recognition; object detection; support vector machines; video signal processing; UCF Sports; UCF50 datasets; YouTube; action pre-learned classifiers; action recognition; deformable part-based object detector; latent structural SVM; object pre-learned classifier; scene pre-learned classifiers; Accuracy; Context; Context modeling; Correlation; Feature extraction; Training; LSSVM; scene recognition

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Pattern Recognition (ICPR), 2014 22nd International Conference on

Conference_Location :

Stockholm

ISSN :

1051-4651

Type :

conf

DOI :

10.1109/ICPR.2014.350

Filename :

6977062

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=178104