DocumentCode :
1335349
Title :
Reading between the Lines: Object Localization Using Implicit Cues from Image Tags
Author :
Hwang, Sung Ju ; Grauman, Kristen
Author_Institution :
Dept. of Comput. Sci., Univ. of Texas at Austin, Austin, TX, USA
Volume :
34
Issue :
6
fYear :
2012
fDate :
6/1/2012 12:00:00 AM
Firstpage :
1145
Lastpage :
1158
Abstract :
Current uses of tagged images typically exploit only the most explicit information: the link between the nouns named and the objects present somewhere in the image. We propose to leverage “unspoken” cues that rest within an ordered list of image tags so as to improve object localization. We define three novel implicit features from an image's tags: the relative prominence of each object as signified by its order of mention, the scale constraints implied by unnamed objects, and the loose spatial links hinted at by the proximity of names on the list. By learning a conditional density over the localization parameters (position and scale) given these cues, we show how to improve both accuracy and efficiency when detecting the tagged objects. Furthermore, we show how the localization density can be learned in a semantic space shared by the visual and tag-based features, which makes the technique applicable for detection in untagged input images. We validate our approach on the PASCAL VOC, LabelMe, and Flickr image data sets, and demonstrate its effectiveness relative to both traditional sliding windows as well as a visual context baseline. Our algorithm improves state-of-the-art methods, successfully translating insights about human viewing behavior (such as attention, perceived importance, or gaze) into enhanced object detection.
Keywords :
feature extraction; object detection; social networking (online); Flickr image data sets; LabelMe image data sets; PASCAL VOC image data sets; conditional density; image tags; implicit cues; implicit features; localization density; localization parameters; loose spatial links; object localization; object relative prominence; scale constraints; semantic space; sliding windows; tag-based features; visual context baseline; visual-based features; Context; Correlation; Detectors; Semantics; Training; Visualization; object recognition; Algorithms; Cues; Humans; Pattern Recognition, Visual
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/TPAMI.2011.190
Filename :
6030877