• DocumentCode
    1530031
  • Title

    Duplicate-Search-Based Image Annotation Using Web-Scale Data

  • Author

    Wang, Xin-Jing ; Zhang, Lei ; Ma, Wei-Ying

  • Author_Institution
    Microsoft Research Asia, Haidian District, Beijing, China
  • Volume
    100
  • Issue
    9
  • fYear
    2012
  • Firstpage
    2705
  • Lastpage
    2721
  • Abstract
    Easy photo-taking and photo-sharing have made images an increasingly important type of media in people's everyday lives, creating a growing demand for practical image understanding techniques. Traditional computer vision and machine learning methods, which learn models from a set of training data, are still at the stage of tackling hundreds of object categories. Such a scale is far from practical use. In recent years, search-based image annotation on large-scale data sets has demonstrated great success. Rather than directly mapping visual features to text, which is inevitably hindered by the semantic gap, it understands the content of an image by propagating the labels of similar images in a large-scale data set. Since similarity search is performed among homogeneous data, the difficulty is greatly reduced. This paper summarizes the extensive work on web image annotation using the large-scale metadata and social information available on the Web, and introduces the Arista system, a nonparametric image annotation platform built upon two billion web images. We propose a highly efficient and scalable duplicate-search technique so that the Arista system can be deployed on a few servers. A few interesting applications, such as building a large-scale celebrity face database and text-to-image translation, are also presented in this paper.
  • Keywords
    Databases; Feature extraction; Image classification; Information retrieval; Measurement; Semantics; Text mining; Automatic image annotation; duplicate-search-based image annotation
  • fLanguage
    English
  • Journal_Title
    Proceedings of the IEEE
  • Publisher
    IEEE
  • ISSN
    0018-9219
  • Type

    jour

  • DOI
    10.1109/JPROC.2012.2193109
  • Filename
    6210348