DocumentCode
3329368
Title
Leveraging Structure from Motion to Learn Discriminative Codebooks for Scalable Landmark Classification
Author
Bergamo, Alessandro ; Sinha, Sudipta N. ; Torresani, Lorenzo
Author_Institution
Dartmouth Coll., Hanover, NH, USA
fYear
2013
fDate
23-28 June 2013
Firstpage
763
Lastpage
770
Abstract
In this paper we propose a new technique for learning a discriminative codebook for local feature descriptors, specifically designed for scalable landmark classification. The key contribution lies in exploiting the knowledge of correspondences within sets of feature descriptors during codebook learning. Feature correspondences are obtained using structure from motion (SfM) computation on Internet photo collections which serve as the training data. Our codebook is defined by a random forest that is trained to map corresponding feature descriptors into identical codes. Unlike prior forest-based codebook learning methods, we utilize fine-grained descriptor labels and address the challenge of training a forest with an extremely large number of labels. Our codebook is used with various existing feature encoding schemes and also a variant we propose for importance-weighted aggregation of local features. We evaluate our approach on a public dataset of 25 landmarks and our new dataset of 620 landmarks (614K images). Our approach significantly outperforms the state of the art in landmark classification. Furthermore, our method is memory efficient and scalable.
Keywords
Internet; feature extraction; image classification; image coding; image motion analysis; learning (artificial intelligence); Internet photo collections; SfM; discriminative codebook learning; feature encoding schemes; fine-grained descriptor labels; identical codes; importance-weighted local feature aggregation; local feature descriptors; scalable landmark classification; structure-from-motion leverage; Cameras; Encoding; Feature extraction; Three-dimensional displays; Training; Vectors; Vegetation; dictionary; discriminative codebook; landmark classification; structure from motion;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Vision and Pattern Recognition (CVPR), 2013 IEEE Conference on
Conference_Location
Portland, OR
ISSN
1063-6919
Type
conf
DOI
10.1109/CVPR.2013.104
Filename
6618948