Regional Information Center for Science and Technology - Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation

DocumentCode :

3748758

Title :

Maximum-Margin Structured Learning with Deep Networks for 3D Human Pose Estimation

Author :

Sijin Li; Weichen Zhang; Antoni B. Chan

Author_Institution :

Dept. of Comput. Sci., City Univ. of Hong Kong, Hong Kong, China

Year :

2015

Firstpage :

2848

Lastpage :

2856

Abstract :

This paper focuses on structured-output learning using deep neural networks for 3D human pose estimation from monocular images. Our network takes an image and a 3D pose as inputs and outputs a score value, which is high when the image-pose pair matches and low otherwise. The network structure consists of a convolutional neural network for image feature extraction, followed by two sub-networks that transform the image features and the pose into a joint embedding. The score function is then the dot product between the image and pose embeddings. The image-pose embedding and score function are jointly trained using a maximum-margin cost function. Our proposed framework can be interpreted as a special form of structured support vector machine in which the joint feature space is discriminatively learned using deep neural networks. We test our framework on the Human3.6M dataset and obtain state-of-the-art results compared to other recent methods. Finally, we present visualizations of the image-pose embedding space, demonstrating that the network has learned a high-level embedding of body orientation and pose configuration.
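
The scoring and training objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the convolutional network and the two sub-networks are replaced here by plain linear maps, the margin term is assumed to be the mean per-joint distance between poses, and the names `linear_embed`, `pose_distance`, `max_margin_loss`, and the candidate pose set are all hypothetical.

```python
import math


def linear_embed(vec, weights):
    """Map a vector into the joint embedding space with one linear layer.

    Assumption: a stand-in for the learned sub-networks, which are deeper.
    """
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def score(image_emb, pose_emb):
    """Score of an image-pose pair: dot product of the two embeddings."""
    return sum(a * b for a, b in zip(image_emb, pose_emb))


def pose_distance(pose_a, pose_b, dims=3):
    """Mean per-joint Euclidean distance between two flattened 3D poses.

    Assumption: used here as the margin (structured loss) term.
    """
    joints = len(pose_a) // dims
    total = 0.0
    for j in range(joints):
        sq = sum((pose_a[j * dims + k] - pose_b[j * dims + k]) ** 2
                 for k in range(dims))
        total += math.sqrt(sq)
    return total / joints


def max_margin_loss(image_feat, true_pose, candidate_poses, w_img, w_pose):
    """Margin-rescaled hinge loss over a finite set of candidate poses.

    The loss is zero when the true pose outscores every candidate by at
    least that candidate's distance from the true pose.
    """
    img_emb = linear_embed(image_feat, w_img)
    true_score = score(img_emb, linear_embed(true_pose, w_pose))
    worst = max(
        score(img_emb, linear_embed(p, w_pose)) + pose_distance(p, true_pose)
        for p in candidate_poses
    )
    return max(0.0, worst - true_score)
```

In this sketch a candidate pose contributes to the loss only when its score plus its distance from the ground truth exceeds the score of the true pose, which is the usual margin-rescaling form of a structured SVM objective.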

Keywords :

"Feature extraction","Three-dimensional displays","Neural networks","Cost function","Support vector machines","Training"

Publisher :

IEEE

Conference_Title :

Computer Vision (ICCV), 2015 IEEE International Conference on

Electronic_ISSN :

2380-7504

Type :

conf

DOI :

10.1109/ICCV.2015.326

Filename :

7410683

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3748758