• DocumentCode
    3004415
  • Title
    What is the spatial extent of an object?
  • Author
    Uijlings, J.R.R. ; Smeulders, Arnold W. M. ; Scha, R.J.H.
  • Author_Institution
    Intell. Syst. Lab., Univ. of Amsterdam, Amsterdam, Netherlands
  • fYear
    2009
  • fDate
    20-25 June 2009
  • Firstpage
    770
  • Lastpage
    777
  • Abstract
    This paper discusses the question: Can we improve the recognition of objects by using their spatial context? We start from Bag-of-Words models and use the Pascal 2007 dataset. We use the rough object bounding boxes that come with this dataset to investigate the fundamental gain context can bring. Our main contributions are: (I) The result of Zhang et al. in CVPR07 that context is superfluous, derived from the Pascal 2005 dataset of 4 classes, does not generalize to this dataset. For our larger and more realistic dataset, context is indeed important. (II) Using the rough bounding box to limit or extend the scope of an object during both training and testing, we find that the spatial extent of an object is determined by its category: (a) Well-defined, rigid objects have the object itself as the preferred spatial extent. (b) Non-rigid objects have an unbounded spatial extent: all spatial extents produce equally good results. (c) Objects primarily categorised based on their function have the whole image as their spatial extent. Finally, (III) using the rough bounding box to treat object and context separately, we find that the upper bound of improvement is 26% relative (12% absolute) in terms of mean average precision, and this bound is likely to be higher if the localisation is done using segmentation. We conclude that object localisation, if done sufficiently precisely, helps considerably in the recognition of objects for the Pascal 2007 dataset.
  • Keywords
    Pascal; image segmentation; object recognition; video retrieval; Bag-of-Words models; Pascal 2007 dataset; image-video retrieval; mean average precision; object localisation; rough object bounding boxes; Frequency conversion; Histograms; Image retrieval; Informatics; Intelligent systems; Kernel; Logic; Sampling methods; Support vector machine classification; Support vector machines
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on
  • Conference_Location
    Miami, FL
  • ISSN
    1063-6919
  • Print_ISBN
    978-1-4244-3992-8
  • Type
    conf
  • DOI
    10.1109/CVPR.2009.5206663
  • Filename
    5206663