Regional Information Center for Science and Technology - 3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image

DocumentCode :

639552

Title :

3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image

Author :

Chakraborty, Imon ; Hui Cheng ; Javed, Omar

Author_Institution :

SRI Int., Princeton, NJ, USA

fYear :

2013

fDate :

23-28 June 2013

Firstpage :

3406

Lastpage :

3413

Abstract :

We present a unified framework for detecting and classifying people interactions in unconstrained user-generated images. Unlike previous approaches that directly map people/face locations in 2D image space into features for classification, we first estimate camera viewpoint and people positions in 3D space and then extract spatial configuration features from explicit 3D people positions. This approach has several advantages. First, it can accurately estimate relative distances and orientations between people in 3D. Second, it encodes spatial arrangements of people into a richer set of shape descriptors than afforded in 2D. Our 3D shape descriptors are invariant to camera pose variations often seen in web images and videos. The proposed approach also estimates camera pose and uses it to capture the intent of the photo. To achieve accurate 3D people layout estimation, we develop an algorithm that robustly fuses semantic constraints about human interpositions into a linear camera model. This enables our model to handle large variations in people's sizes, heights (e.g. due to age), and poses. An accurate 3D layout also allows us to construct features informed by Proxemics that improve our semantic classification. To characterize the human interaction space, we introduce visual proxemes, a set of prototypical patterns that represent commonly occurring social interactions in events. We train a discriminative classifier that classifies 3D arrangements of people into visual proxemes and quantitatively evaluate the performance on a large, challenging dataset.
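
The abstract's idea of extracting relative distances and orientations from explicit 3D people positions can be illustrated with a minimal sketch. This is not the authors' method: the function name `pairwise_layout_features`, the ground-plane `(x, z)` coordinates in metres, and the per-person facing angle in radians are all assumptions made for illustration, and the 3D positions are taken as already estimated.

```python
import math
from itertools import combinations


def pairwise_layout_features(positions, yaws):
    """Return (i, j, distance, relative_orientation) for each pair of people.

    positions: list of (x, z) ground-plane coordinates, assumed to be in metres.
    yaws: list of facing directions, assumed to be in radians.
    Both inputs are hypothetical; the record does not specify a representation.
    """
    features = []
    for i, j in combinations(range(len(positions)), 2):
        dx = positions[j][0] - positions[i][0]
        dz = positions[j][1] - positions[i][1]
        # Euclidean distance between the two people on the ground plane.
        distance = math.hypot(dx, dz)
        # Smallest absolute difference between facing directions, in [0, pi].
        diff = abs(yaws[i] - yaws[j]) % (2 * math.pi)
        relative_orientation = min(diff, 2 * math.pi - diff)
        features.append((i, j, distance, relative_orientation))
    return features
```

Because these quantities are computed from 3D positions rather than pixel coordinates, they do not change when the same group is photographed from a different camera pose, which is the invariance property the abstract emphasizes.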

Keywords :

feature extraction; human computer interaction; image classification; image sensors; object recognition; pose estimation; shape recognition; 3D people layout estimation; 3D shape descriptors; 3D space; 3D visual proxemics; camera pose variations; camera viewpoint estimation; discriminative classifier; human interaction recognition; people interaction classification; people interaction detection; people position estimation; relative distance estimation; relative orientation estimation; semantic classification; social interactions; spatial configuration feature extraction; unconstrained user generated images; Cameras; Face; Robustness; Shape; Three-dimensional displays; Videos; Visualization; 3D people layout; RANSAC; Visual Proxemics; semantic constraints;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on

Conference_Location :

Portland, OR

ISSN :

1063-6919

Type :

conf

DOI :

10.1109/CVPR.2013.437

Filename :

6619281

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=639552