DocumentCode
1530031
Title
Duplicate-Search-Based Image Annotation Using Web-Scale Data
Author
Wang, Xin-Jing ; Zhang, Lei ; Ma, Wei-Ying
Author_Institution
Microsoft Research Asia, Haidian District, Beijing, China
Volume
100
Issue
9
fYear
2012
Firstpage
2705
Lastpage
2721
Abstract
Easy photo-taking and photo-sharing have made images an increasingly important type of media in people's everyday lives, creating a growing demand for practical image understanding techniques. Traditional computer vision and machine learning methods, which learn models from a set of training data, are still at the stage of tackling hundreds of object categories. Such a scale is far from practical use. In recent years, search-based image annotation on large-scale data sets has demonstrated great success. Rather than directly mapping visual features to text, which is inevitably hindered by the semantic gap, it understands the content of an image by propagating the labels of similar images in a large-scale data set. Since similarity search is performed among homogeneous data, the difficulty is greatly reduced. This paper summarizes the extensive work on web image annotation using the large-scale metadata and social information available on the Web, and introduces the Arista system, a nonparametric image annotation platform built upon two billion web images. We propose a highly efficient and scalable duplicate-search technique so that the Arista system can be deployed on a few servers. A few interesting applications, such as building a large-scale celebrity face database and text-to-image translation, are also presented in this paper.
Keywords
Databases; Feature extraction; Image classification; Information retrieval; Measurement; Semantics; Text mining; Automatic image annotation; duplicate-search-based image annotation
fLanguage
English
Journal_Title
Proceedings of the IEEE
Publisher
IEEE
ISSN
0018-9219
Type
jour
DOI
10.1109/JPROC.2012.2193109
Filename
6210348