DocumentCode :
629089
Title :
Retina enhanced SIFT descriptors for video indexing
Author :
Strat, Sabin Tiberius ; Benoit, A. ; Lambert, Patrick
Author_Institution :
LISTIC, Univ. de Savoie Annecy Le Vieux, Annecy, France
fYear :
2013
fDate :
17-19 June 2013
Firstpage :
201
Lastpage :
206
Abstract :
This paper investigates how the detection of diverse high-level semantic concepts (objects, actions, scene types, persons, etc.) in videos can be improved by applying a model of the human retina. A large part of the current approaches to Content-Based Image/Video Retrieval (CBIR/CBVR) relies on the Bag-of-Words (BoW) model, which has been shown to perform well, especially for object recognition in static images. Nevertheless, the current state-of-the-art framework shows its limits when applied to videos because of the added temporal information. In this paper, we enhance a BoW model based on the classical SIFT local spatial descriptor by preprocessing videos with a model of the human retina. This retinal preprocessing makes the SIFT descriptor aware of temporal information. Our proposed descriptors extend the genericity of SIFT to spatio-temporal content, which makes them of interest for generic video indexing. They also benefit from the retinal spatio-temporal “robustness” to various disturbances such as noise, compression artifacts, luminance variations or shadows. The proposed approaches are evaluated on the TRECVID 2012 Semantic Indexing task dataset.
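The pipeline outlined in the abstract (temporal preprocessing, then local descriptors, then a BoW histogram) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a simple first-order temporal high-pass filter as a stand-in for the retina model and flattened 2x2 patches as a stand-in for SIFT, and every function name and parameter below is hypothetical.

```python
"""Toy sketch: temporal prefiltering followed by bag-of-words quantisation.

Assumptions (not from the paper): a running-average temporal high-pass
filter stands in for the retina model, and raw 2x2 patches stand in for SIFT.
"""
from math import dist


def temporal_highpass(frames, alpha=0.5):
    """Return |frame - running temporal average| for each frame.

    Static content cancels out; content that changes over time survives.
    """
    background = [row[:] for row in frames[0]]
    filtered = []
    for frame in frames:
        filtered.append([
            [abs(p - b) for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)
        ])
        # Update the running average after the frame has been filtered.
        background = [
            [(1 - alpha) * b + alpha * p for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)
        ]
    return filtered


def patch_descriptors(frame, size=2):
    """Flatten non-overlapping size x size patches into descriptor vectors."""
    descriptors = []
    for r in range(0, len(frame) - size + 1, size):
        for c in range(0, len(frame[0]) - size + 1, size):
            descriptors.append(
                [frame[r + i][c + j] for i in range(size) for j in range(size)]
            )
    return descriptors


def bow_histogram(descriptors, codebook):
    """Assign each descriptor to its nearest codeword; return an L1-normalised histogram."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)), key=lambda k: dist(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


# Two 4x4 frames: a static bright pixel at (3, 3) and a pixel moving
# from (0, 0) to (0, 2).
frames = [
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]],
    [[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]],
]
# Hypothetical two-word codebook: "empty patch" and "bright top-left corner".
codebook = [[0, 0, 0, 0], [1, 0, 0, 0]]

filtered = temporal_highpass(frames)
raw_hist = bow_histogram(patch_descriptors(frames[1]), codebook)
filtered_hist = bow_histogram(patch_descriptors(filtered[1]), codebook)
print(raw_hist, filtered_hist)
```

In this toy setting the static pixel disappears after filtering, so the histogram computed on the filtered frame reflects only the moving content. A real system would replace the filter with an actual retina model and the patches with SIFT descriptors, and would learn the codebook by clustering.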
Keywords :
content-based retrieval; indexing; transforms; video retrieval; video signal processing; BoW model; CBIR-CBVR; SIFT genericity; TRECVID 2012 Semantic Indexing task dataset; bag-of-words model; classical SIFT local spatial descriptor; compression artifacts; content-based image retrieval; content-based video retrieval; generic video indexing; high-level semantic concept detection; human retina model; luminance variations; noise; retina enhanced SIFT descriptors; retinal preprocessing; retinal spatio-temporal robustness; shadows; spatio-temporal content; temporal information; video preprocessing; Feature extraction; Indexing; Noise; Retina; Semantics; Transient analysis; Visualization;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Content-Based Multimedia Indexing (CBMI), 2013 11th International Workshop on
Conference_Location :
Veszprem
ISSN :
1949-3983
Print_ISBN :
978-1-4799-0955-1
Type :
conf
DOI :
10.1109/CBMI.2013.6576582
Filename :
6576582