• DocumentCode
    2591561
  • Title
    Object categorization by learned universal visual dictionary
  • Author
    Winn, J.; Criminisi, A.; Minka, T.
  • Author_Institution
    Microsoft Res., Cambridge
  • Volume
    2
  • fYear
    2005
  • fDate
    17-21 Oct. 2005
  • Firstpage
    1800
  • Abstract
    This paper presents a new algorithm for the automatic recognition of object classes from images (categorization). Compact yet discriminative appearance-based object class models are automatically learned from a set of training images. The method is simple and extremely fast, making it suitable for many applications such as semantic image retrieval, Web search, and interactive image editing. It classifies a region according to the proportions of different visual words (clusters in feature space). The specific visual words and the typical proportions in each object are learned from a segmented training set. The main contribution of this paper is twofold: i) an optimally compact visual dictionary is learned by pair-wise merging of visual words from an initially large dictionary. The final visual words are described by GMMs. ii) A novel statistical measure of discrimination is proposed which is optimized by each merge operation. High classification accuracy is demonstrated for nine object classes on photographs of real objects viewed under general lighting conditions, poses and viewpoints. The set of test images used for validation comprises: i) photographs acquired by us, ii) images from the Web and iii) images from the recently released Pascal dataset. The proposed algorithm performs well on both texture-rich objects (e.g. grass, sky, trees) and structure-rich ones (e.g. cars, bikes, planes).
  • Keywords
    image classification; learning (artificial intelligence); object recognition; discriminative appearance-based object class models; image categorization; object categorization; object class recognition; universal visual dictionary; visual words; Bicycles; Deformable models; Dictionaries; Image recognition; Image retrieval; Image segmentation; Lighting; Merging; Testing; Web search;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision, 2005. ICCV 2005. Tenth IEEE International Conference on
  • Conference_Location
    Beijing
  • ISSN
    1550-5499
  • Print_ISBN
    0-7695-2334-X
  • Type
    conf
  • DOI
    10.1109/ICCV.2005.171
  • Filename
    1544935
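
The abstract describes two ideas: classifying a region by the proportions of visual words it contains, and shrinking a large dictionary by pair-wise merging of words. The sketch below illustrates both in a minimal form and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the nearest-mean classifier, and in particular the `separation` score, which is a simple placeholder and not the statistical measure of discrimination proposed in the paper (the abstract does not define it). The GMM description of the final words is omitted.

```python
import numpy as np

def word_histogram(word_ids, n_words):
    """Proportion of each visual word among the features of one region."""
    counts = np.bincount(word_ids, minlength=n_words).astype(float)
    return counts / counts.sum()

def classify(hist, class_means):
    """Index of the class whose mean histogram is closest (L2 distance).
    Assumption: a nearest-mean rule stands in for the paper's class model."""
    dists = np.linalg.norm(class_means - hist, axis=1)
    return int(np.argmin(dists))

def merge_words(hists, i, j):
    """Merge visual words i and j by summing their histogram bins."""
    merged = np.delete(hists, j, axis=1)   # drop column j (returns a copy)
    keep = i if i < j else i - 1           # column i may shift left by one
    merged[:, keep] += hists[:, j]
    return merged

def separation(class_means):
    """Placeholder score: smallest L2 distance between two class means.
    Assumption: NOT the discrimination measure proposed in the paper."""
    k = len(class_means)
    return min(np.linalg.norm(class_means[a] - class_means[b])
               for a in range(k) for b in range(a + 1, k))

def greedy_merge(class_means, target_size):
    """Repeatedly apply the pair-wise merge that maximizes the score."""
    means = class_means.copy()
    while means.shape[1] > target_size:
        n = means.shape[1]
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
        best = max(pairs, key=lambda p: separation(merge_words(means, *p)))
        means = merge_words(means, *best)
    return means

# Toy data: two classes, four initial visual words.
class_means = np.array([[0.5, 0.5, 0.0, 0.0],
                        [0.0, 0.0, 0.5, 0.5]])
region = word_histogram(np.array([0, 0, 1, 3]), n_words=4)
label = classify(region, class_means)
compact = greedy_merge(class_means, target_size=2)
```

In this toy case the words that co-occur within a class are merged, so the two classes remain separable with half as many words. An exhaustive search over all pairs at every step, as done here, is only practical for small dictionaries.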