DocumentCode :
2088019
Title :
Using Language to Drive the Perceptual Grouping of Local Image Features
Author :
Jamieson, Michael ; Dickinson, Sven ; Stevenson, Suzanne ; Wachsmuth, Sven
Author_Institution :
University of Toronto
Volume :
2
fYear :
2006
fDate :
2006
Firstpage :
2102
Lastpage :
2109
Abstract :
We address the problem of learning both the semantics (names) and the visual features (SIFT collections) of objects appearing in a training set of unstructured, captioned images of cluttered scenes. Prior work in applying machine translation models to learn the associations between image features and caption nouns has assumed a one-to-one correspondence between features and nouns. However, each training image may contain thousands of SIFT features belonging to multiple objects. Our challenge is two-fold: 1) grouping the SIFT features into meaningful collections, and 2) learning the object names associated with those collections. Since better collections tend to have stronger associations with object names, we offer an integrated solution that uses the caption words to drive the feature grouping process. The result is a more general model acquisition framework that does not assume words correspond to individual features and does not require training images with isolated objects or unambiguous labels. The model that is learned performs well at labeling cluttered scenes in a set of test images.
Keywords :
Background noise; Feature extraction; Image recognition; Image representation; Image segmentation; Labeling; Layout; Performance evaluation; Pixel; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision and Pattern Recognition, 2006 IEEE Computer Society Conference on
ISSN :
1063-6919
Print_ISBN :
0-7695-2597-0
Type :
conf
DOI :
10.1109/CVPR.2006.325
Filename :
1641011