DocumentCode
254261
Title
Tell Me What You See and I Will Show You Where It Is
Author
Jia Xu ; Schwing, Alexander Gerhard ; Urtasun, Raquel
fYear
2014
fDate
23-28 June 2014
Firstpage
3190
Lastpage
3197
Abstract
We tackle the problem of weakly labeled semantic segmentation, where the only annotations are image tags encoding which classes are present in the scene. This is an extremely difficult problem, as no pixel-wise labelings are available, not even at training time. In this paper, we show that this problem can be formalized as an instance of learning in a latent structured prediction framework, where the graphical model encodes the presence and absence of a class as well as the assignments of semantic labels to superpixels. As a consequence, we are able to leverage standard algorithms with good theoretical properties. We demonstrate the effectiveness of our approach using the challenging SIFT-flow dataset and show average per-class accuracy improvements of 7% over the state-of-the-art.
Keywords
graph theory; image segmentation; learning (artificial intelligence); transforms; SIFT-flow dataset; graphical model; image tags; latent structured prediction framework; scale-invariant feature transforms; semantic segmentation; Accuracy; Graphical models; Image segmentation; Labeling; Semantics; Training; Vectors; Graphical Model; Semantic Segmentation; Structured Prediction; Weakly Supervised Learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
Conference_Location
Columbus, OH
Type
conf
DOI
10.1109/CVPR.2014.408
Filename
6909804