Articulated pose estimation with tiny synthetic videos

Author

Dennis Park;Deva Ramanan

Author_Institution

UC Irvine, CA 92697, United States

fYear

2015

fDate

6/1/2015 12:00:00 AM

Firstpage

Lastpage

Abstract

We address the task of articulated pose estimation from video sequences. We consider an interactive setting where the initial pose is annotated in the first frame. Our system synthesizes a large number of hypothetical scenes with different poses and camera positions by applying geometric deformations to the first frame. We use these synthetic images to generate a custom labeled training set for the video in question. This training data is then used to learn a regressor (for future frames) that predicts joint locations from image data. Notably, our training set is so accurate that nearest-neighbor (NN) matching on low-resolution pixel features works well. As such, we name our underlying representation “tiny synthetic videos”. We present quantitative results on the Friends benchmark dataset that suggest our simple approach matches or exceeds the state of the art.
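
The pipeline described in the abstract (synthesize labeled variants of the first frame, then match later frames by nearest neighbor on low-resolution pixels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `tiny`, `synthesize`, and `nn_predict` are hypothetical, integer translation via `np.roll` stands in for the paper's pose and camera deformations, and block averaging is only one assumed way to obtain low-resolution pixel features.

```python
import numpy as np

def tiny(img, size=8):
    """Low-resolution pixel feature: block-average a grayscale image to size x size."""
    h, w = img.shape
    bh, bw = h // size, w // size
    img = img[:bh * size, :bw * size]          # crop so blocks divide evenly
    return img.reshape(size, bh, size, bw).mean(axis=(1, 3)).ravel()

def synthesize(frame, joints, shifts):
    """Build a labeled training set from the annotated first frame.

    ASSUMPTION: each synthetic image is a (dy, dx) translation of the frame,
    a simple stand-in for the richer geometric deformations in the abstract.
    joints is a (J, 2) array of (x, y) locations and is shifted consistently.
    """
    feats, labels = [], []
    for dy, dx in shifts:
        warped = np.roll(frame, (dy, dx), axis=(0, 1))
        feats.append(tiny(warped))
        labels.append(joints + np.array([dx, dy]))
    return np.stack(feats), np.stack(labels)

def nn_predict(query, feats, labels):
    """Return the joint labels of the training image nearest to the query frame."""
    dists = np.sum((feats - tiny(query)) ** 2, axis=1)
    return labels[np.argmin(dists)]

# Toy usage: a bright square plays the role of the person in the first frame.
frame = np.zeros((32, 32))
frame[12:16, 12:16] = 1.0
joints = np.array([[14, 14]])                  # one annotated joint, (x, y)
shifts = [(dy, dx) for dy in (-8, 0, 8) for dx in (-8, 0, 8)]
feats, labels = synthesize(frame, joints, shifts)

later_frame = np.roll(frame, (8, -8), axis=(0, 1))  # subject moved down and left
predicted = nn_predict(later_frame, feats, labels)
```

In this toy setup the later frame coincides exactly with one synthetic image, so the nearest neighbor returns that image's shifted joint label. Real footage would need a far denser and more varied set of deformations for the match to be meaningful.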

Keywords

"Videos","Training","Engines","Image resolution","Rendering (computer graphics)","Joints"

Publisher

IEEE

Conference_Titel

Computer Vision and Pattern Recognition Workshops (CVPRW), 2015 IEEE Conference on

Electronic_ISBN

2160-7516

Type

conf

DOI

10.1109/CVPRW.2015.7301337

Filename

7301337

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3673962