DocumentCode :
32117
Title :
Heterogeneous Visual Codebook Integration Via Consensus Clustering for Visual Categorization
Author :
Lopez-Sastre, Roberto J. ; Renes-Olalla, Javier ; Gil-Jimenez, Pedro ; Maldonado-Bascon, Saturnino ; Lafuente-Arroyo, Sergio
Author_Institution :
Dept. of Signal Theor. & Commun., Univ. of Alcala, Alcalá de Henares, Spain
Volume :
23
Issue :
8
fYear :
2013
fDate :
Aug. 2013
Firstpage :
1358
Lastpage :
1368
Abstract :
Most recent category-level object and activity recognition systems work with visual words, i.e., vector-quantized local descriptors. These visual vocabularies are usually built by using a local feature, such as SIFT, and a single clustering algorithm, such as K-means. However, very different clustering algorithms are at our disposal, each of them discovering different structures in the data. In this paper, we explore how to combine these heterogeneous codebooks and introduce a novel approach for their integration via consensus clustering. Considering each visual vocabulary as one modality, we propose the visual word aggregation (VWA) methodology to learn a common codebook, where the stability of the visual vocabulary construction process is increased, the size of the codebook is determined in an unsupervised integration, and more discriminative representations are obtained. With the aim of obtaining contextual visual words, we also incorporate the spatial neighboring relation between the local descriptors into the VWA process: the contextual-VWA approach. We integrate over-segmentation algorithms and spatial grids into the aggregation process to obtain a visual vocabulary that narrows the semantic gap between visual words and visual concepts. We show how the proposed codebooks perform in recognizing objects and scenes on very challenging datasets. Compared with unimodal visual codebook construction approaches, our multimodal approach always achieves superior performance.
Keywords :
image representation; object recognition; pattern classification; VWA methodology; activity recognition systems; clustering algorithm; consensus clustering; contextual visual words; discriminative representations; heterogeneous codebooks; object recognition systems; oversegmentation algorithms; spatial grids; spatial neighboring relation; unsupervised integration; visual vocabulary construction process; visual word aggregation; Clustering algorithms; Histograms; Semantics; Vector quantization; Vectors; Visualization; Vocabulary; Clustering aggregation; consensus clustering; object recognition; scene recognition; visual words;
fLanguage :
English
Journal_Title :
Circuits and Systems for Video Technology, IEEE Transactions on
Publisher :
IEEE
ISSN :
1051-8215
Type :
jour
DOI :
10.1109/TCSVT.2013.2243058
Filename :
6422366