• DocumentCode
    598809
  • Title
    Retina-enhanced SURF descriptors for semantic concept detection in videos
  • Author
    Strat, Sabin Tiberius ; Benoit, A. ; Lambert, Peter ; Caplier, A.
  • Author_Institution
    LISTIC, Univ. de Savoie, Annecy Le Vieux, France
  • fYear
    2012
  • fDate
    15-18 Oct. 2012
  • Firstpage
    319
  • Lastpage
    324
  • Abstract
    This paper investigates the potential benefit of using low-level human vision behaviors for high-level semantic concept detection. A large part of current approaches relies on the Bag-of-Words (BoW) model, which has proven to be a good choice, especially for object recognition in images. Extending it from static images to video sequences raises new problems, chiefly how to use the added temporal dimension to detect the target concepts (swimming, drinking...). In this study, we propose applying a human retina model to preprocess video sequences before constructing a state-of-the-art BoW analysis. This preprocessing, designed to enhance the appearance of static image elements in particular, increases performance by introducing robustness to traditional image and video problems such as luminance variation, shadows, compression artifacts and noise. These approaches are evaluated on the TrecVid 2010 Semantic Indexing task datasets, containing 130 high-level semantic concepts. We use the well-known SURF descriptor as the entry point of the BoW system, but this work could be extended to any other local gradient-based descriptor.
  • Keywords
    computer vision; data compression; eye; gradient methods; image sequences; object detection; object recognition; video signal processing; BoW model; TrecVid 2010 semantic indexing task dataset; bag-of-words model; compression artifact; high-level semantic concept detection; human retina model; image problem; local gradient based descriptor; low-level human vision behavior; luminance variation; noise; object recognition; retina-enhanced SURF descriptor; shadow; static image; temporal dimension; video problem; video sequence; Feature extraction; Noise; Retina; Semantics; Videos; Visualization; Vocabulary; Bag of words; Retina analysis; Retina preprocessing; SURF; Semantics; Video content; Video indexation;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing Theory, Tools and Applications (IPTA), 2012 3rd International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    2154-5111
  • Print_ISBN
    978-1-4673-2585-1
  • Type
    conf
  • DOI
    10.1109/IPTA.2012.6469557
  • Filename
    6469557