DocumentCode
2713182
Title
Exploiting nonlocal spatiotemporal structure for video segmentation
Author
Cheng, Hsien-Ting; Ahuja, N.
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear
2012
fDate
16-21 June 2012
Firstpage
741
Lastpage
748
Abstract
Unsupervised video segmentation is a challenging problem because it involves a large amount of data, and image segments undergo noisy variations in color, texture, and motion over time. However, there are significant redundancies that can help disambiguate the effects of noise. To exploit these redundancies and obtain the most spatio-temporally consistent video segmentation, we formulate the problem as a consistent labeling problem by exploiting higher order image structure. A label stands for a specific moving segment. Each segment (or region) is treated as a random variable which is to be assigned a label. Regions assigned the same label comprise a 3D space-time segment, or a region tube. The labels can also be automatically created or terminated at any frame in the video sequence, to allow for objects entering or leaving the scene. To formulate this problem, we use a conditional random field (CRF) model. Unlike a conventional CRF, which has only unary and binary potentials, we also use higher order potentials to favor label consistency among disconnected spatial and temporal segments. Compared to region tracking based methods, the main advantages of the proposed algorithm are twofold: (1) the label consistency constraints are imposed on multiple regions but in a soft manner, and (2) the labeling decision is postponed until the confidence in the labeling is high. We compare our results with a recent state-of-the-art video segmentation algorithm and show that our results are quantitatively and qualitatively better.
Keywords
image colour analysis; image motion analysis; image segmentation; image sequences; image texture; random processes; video signal processing; 3D space-time segment; CRF; conditional random field model; higher order image structure; image color; image motion; image segmentation; image texture; label consistency constraint; labeling decision; labeling problem; moving segment; noisy variation; nonlocal spatiotemporal structure; random variable; region tracking; region tube; spatial segments; temporal segments; unsupervised video segmentation; video sequence; Electron tubes; Image segmentation; Labeling; Merging; Random variables; Redundancy; Robustness;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
Conference_Location
Providence, RI
ISSN
1063-6919
Print_ISBN
978-1-4673-1226-4
Electronic_ISBN
1063-6919
Type
conf
DOI
10.1109/CVPR.2012.6247744
Filename
6247744