DocumentCode
2541326
Title
Discovering objects and their location in images
Author
Sivic, Josef ; Russell, Bryan C. ; Efros, Alexei A. ; Zisserman, Andrew ; Freeman, William T.
Author_Institution
Dept. of Eng. Sci., Oxford Univ., UK
Volume
1
fYear
2005
fDate
17-21 Oct. 2005
Firstpage
370
Abstract
We seek to discover the object categories depicted in a set of unlabelled images. We achieve this using a model developed in the statistical text literature: probabilistic latent semantic analysis (pLSA). In text analysis, this is used to discover topics in a corpus using the bag-of-words document representation. Here we treat object categories as topics, so that an image containing instances of several categories is modeled as a mixture of topics. The model is applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. The topic discovery approach successfully translates to the visual domain: for a small set of objects, we show that both the object categories and their approximate spatial layout are found without supervision. Performance of this unsupervised method is compared to the supervised approach of Fergus et al. (2003) on a set of unseen images containing only one object per image. We also extend the bag-of-words vocabulary to include 'doublets' which encode spatially local co-occurring regions. It is demonstrated that this extended vocabulary gives a cleaner image segmentation. Finally, the classification and segmentation methods are applied to a set of images containing multiple objects per image. These results demonstrate that we can successfully build object class models from an unsupervised analysis of images.
Keywords
image classification; image representation; image segmentation; vector quantisation; SIFT-like region descriptor; object category; object discovery; object location; probabilistic latent semantic analysis; text analysis; topic discovery approach; unsupervised analysis; word visual analogue; Clustering algorithms; Computer vision; Frequency; Graphical models; Histograms; Labeling; Object detection; Random variables; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
ISSN
1550-5499
Print_ISBN
0-7695-2334-X
Type
conf
DOI
10.1109/ICCV.2005.77
Filename
1541280