Regional Information Center for Science and Technology (RICeST) - Selection and context for action recognition

DocumentCode :

2292697

Title :

Selection and context for action recognition

Author :

Han, Dong ; Bo, Liefeng ; Sminchisescu, Cristian

Author_Institution :

Univ. of Bonn, Bonn, Germany

fYear :

2009

fDate :

Sept. 29 2009-Oct. 2 2009

Firstpage :

1933

Lastpage :

1940

Abstract :

Recognizing human action in non-instrumented video is a challenging task not only because of the variability produced by general scene factors like illumination, background, occlusion or intra-class variability, but also because of subtle behavioral patterns among interacting people or between people and objects in images. To improve recognition, a system may need to use not only low-level spatio-temporal video correlations but also relational descriptors between people and objects in the scene. In this paper we present contextual scene descriptors and Bayesian multiple kernel learning methods for recognizing human action in complex non-instrumented video. Our contribution is threefold: (1) we introduce bag-of-detector scene descriptors that encode presence/absence and structural relations between object parts; (2) we derive a novel Bayesian classification method based on Gaussian processes with multiple kernel covariance functions (MKGPC), in order to automatically select and weight multiple features, both low-level and high-level, out of a large collection, in a principled way; and (3) we perform a large-scale evaluation using a variety of features on the KTH dataset and a recently introduced, challenging Hollywood movie dataset. On the KTH dataset, we obtain 94.1% accuracy, the best result reported to date. On the Hollywood dataset we obtain promising results in several action classes using fewer descriptors, and an improvement of about 9.1% on a previous benchmark test.
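
The idea of a covariance function built from multiple kernels, with weights that select and rank feature types, can be illustrated with a small sketch. This is not the authors' MKGPC implementation: the RBF kernel choice, the fixed weights, the function names `rbf_kernel` and `combined_kernel`, and the toy feature arrays are all assumptions made for illustration. In the paper the weights are inferred in a Bayesian way; here they are simply given, and a weight of zero removes a feature type from the combined covariance.

```python
import numpy as np

def rbf_kernel(X, Y, length_scale=1.0):
    """Squared-exponential kernel between the rows of X and the rows of Y."""
    sq_dist = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def combined_kernel(features_a, features_b, weights):
    """Weighted sum of one RBF kernel per feature type.

    Non-negative weights keep the sum a valid covariance function;
    a zero weight drops the corresponding feature type entirely.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("kernel weights must be non-negative")
    return sum(
        w * rbf_kernel(A, B)
        for w, A, B in zip(weights, features_a, features_b)
    )

# Toy data (hypothetical): 5 video clips, two feature types per clip.
rng = np.random.default_rng(0)
low_level = rng.normal(size=(5, 8))   # stand-in for spatio-temporal features
high_level = rng.normal(size=(5, 3))  # stand-in for contextual scene descriptors

features = [low_level, high_level]
# Fixed weights for illustration only; the second feature type is switched off.
K = combined_kernel(features, features, weights=[0.7, 0.0])
```

The resulting matrix `K` could serve as the covariance of a Gaussian process classifier. Learning the weights from data, which is the selection step the abstract describes, is outside the scope of this sketch.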

Keywords :

Bayes methods; Gaussian processes; covariance analysis; image classification; image coding; image motion analysis; learning (artificial intelligence); video signal processing; Bayesian classification method; Bayesian multiple kernel learning method; Gaussian process; Hollywood movie dataset; KTH dataset; absence encoding; bag-of-detector scene descriptors; contextual scene descriptors; human action recognition; low-level spatio-temporal video correlations; multiple kernel covariance function; noninstrumented video; presence encoding; relational descriptors; structural relation encoding; Bayesian methods; Gaussian processes; Humans; Image recognition; Kernel; Layout; Learning systems; Lighting; Pattern recognition; Performance evaluation;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision, 2009 IEEE 12th International Conference on

Conference_Location :

Kyoto

ISSN :

1550-5499

Print_ISBN :

978-1-4244-4420-5

Electronic_ISBN :

1550-5499

Type :

conf

DOI :

10.1109/ICCV.2009.5459427

Filename :

5459427

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2292697