Title :
Towards semantic embedding in visual vocabulary
Author :
Ji, Rongrong ; Yao, Hongxun ; Sun, Xiaoshuai ; Zhong, Bineng ; Gao, Wen
Author_Institution :
Harbin Inst. of Technol., Harbin, China
Abstract :
Visual vocabulary serves as a fundamental component in many computer vision tasks, such as object recognition, visual search, and scene modeling. While state-of-the-art approaches build the visual vocabulary based solely on visual statistics of local image patches, the associated image labels are left unexploited in generating visual words. In this work, we present a semantic embedding framework that integrates semantic information from Flickr labels for supervised vocabulary construction. Our main contribution is a Hidden Markov Random Field model that supervises feature space quantization, with specialized consideration of label correlations. Local visual features are modeled as an Observed Field, which follows visual metrics to partition the feature space. Semantic labels are modeled as a Hidden Field, which imposes generative supervision on the Observed Field, with WordNet-based correlation constraints expressed as a Gibbs distribution. By simplifying the Markov property in the Hidden Field, both unsupervised and supervised (label-independent) vocabularies can be derived from our framework. We validate our approach on two challenging computer vision tasks, with comparisons to state-of-the-art methods: (1) large-scale image search on a Flickr 60,000 database; (2) object recognition on the PASCAL VOC database.
Keywords :
computer vision; hidden Markov models; vocabulary; Flickr label correlation; Gibbs distribution; PASCAL VOC database; WordNet-based correlation constraints; feature space quantization supervision; hidden Markov random field modeling; hidden field model; large-scale image search; local image patches; local visual features; object recognition; observed field model; scene modeling; semantic embedding framework; supervised vocabulary; supervised vocabulary construction; unsupervised vocabulary; visual search; visual statistics; visual vocabulary; Computer vision; Extraterrestrial measurements; Hidden Markov models; Image databases; Large-scale systems; Layout; Object recognition; Quantization; Statistics; Vocabulary;
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on
Conference_Location :
San Francisco, CA
Print_ISBN :
978-1-4244-6984-0
DOI :
10.1109/CVPR.2010.5540118