Author :
Baroffio, Luca ; Canclini, Antonio ; Cesana, Matteo ; Redondi, Alessandro ; Tagliasacchi, Marco ; Tubaro, Stefano
Author_Institution :
Dipt. di Elettron., Inf. e Bioingegneria, Politec. di Milano, Milan, Italy
Abstract :
Binary local features represent an effective alternative to real-valued descriptors, leading to comparable results for many visual analysis tasks while being characterized by significantly lower computational complexity and memory requirements. When dealing with large collections, a more compact representation based on global features is often preferred, which can be obtained from local features by means of, e.g., the bag-of-visual-words model. Several applications, including, for example, visual sensor networks and mobile augmented reality, require visual features to be transmitted over a bandwidth-limited network, thus calling for coding techniques that aim at reducing the required bit budget while attaining a target level of efficiency. In this paper, we investigate a coding scheme tailored to both local and global binary features, which aims at exploiting both spatial and temporal redundancy by means of intra- and inter-frame coding. In this respect, the proposed coding scheme can conveniently be adopted to support the analyze-then-compress (ATC) paradigm. That is, visual features are extracted from the acquired content, encoded at remote nodes, and finally transmitted to a central controller that performs the visual analysis. This is in contrast with the traditional approach, in which visual content is acquired at a node, compressed and then sent to a central unit for further processing, according to the compress-then-analyze (CTA) paradigm. In this paper, we experimentally compare the ATC and the CTA by means of rate-efficiency curves in the context of two different visual analysis tasks: 1) homography estimation and 2) content-based retrieval. Our results show that the novel ATC paradigm based on the proposed coding primitives can be competitive with the CTA, especially in bandwidth-limited scenarios.
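The abstract does not spell out the coding primitives, so the following is only a minimal sketch of the general idea behind inter-frame coding of binary descriptors, under stated assumptions: each descriptor is stored as a Python integer, the encoder picks the closest descriptor from the previous frame by Hamming distance, and only the XOR residual plus the reference index are transmitted. The function names (`encode_inter`, `decode_inter`, `estimated_inter_bits`) and the idealized entropy-based cost estimate are hypothetical illustrations, not the scheme proposed in the paper.

```python
import math
import random


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two descriptors stored as ints."""
    return bin(a ^ b).count("1")


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) source."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def encode_inter(current: int, references: list[int]) -> tuple[int, int]:
    """Pick the closest reference (Hamming distance); return (index, XOR residual)."""
    idx = min(range(len(references)), key=lambda i: hamming(current, references[i]))
    return idx, current ^ references[idx]


def decode_inter(idx: int, residual: int, references: list[int]) -> int:
    """Invert encode_inter: XOR the residual back onto the chosen reference."""
    return references[idx] ^ residual


def estimated_inter_bits(residual: int, n_bits: int, n_refs: int) -> float:
    """Idealized cost (assumption): index bits plus residual at its empirical entropy."""
    p = bin(residual).count("1") / n_bits
    return math.log2(n_refs) + n_bits * binary_entropy(p)


if __name__ == "__main__":
    rng = random.Random(0)
    n_bits, n_refs = 512, 64  # illustrative sizes, not taken from the paper
    references = [rng.getrandbits(n_bits) for _ in range(n_refs)]
    # Simulate temporal redundancy: flip roughly 5% of the bits of one reference.
    noise = sum(1 << k for k in range(n_bits) if rng.random() < 0.05)
    current = references[10] ^ noise
    idx, residual = encode_inter(current, references)
    assert decode_inter(idx, residual, references) == current
    print("intra cost (raw bits):", n_bits)
    print("inter cost (estimate):", round(estimated_inter_bits(residual, n_bits, n_refs), 1))
```

When consecutive frames share most of their content, the residual is sparse and its estimated cost falls well below the raw descriptor length, which is the kind of temporal redundancy an inter-frame mode is meant to exploit. A practical encoder would also need a real entropy coder and a rule for falling back to intra-frame coding when no good reference exists.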
Keywords :
content-based retrieval; feature extraction; image sequences; video coding; analyze-then-compress paradigm; compress-then-analyze paradigm; content-based retrieval; global binary visual features extraction; homography estimation; inter-frame coding scheme; intra-frame coding scheme; local binary visual features extraction; rate-efficiency curves; video sequences; visual analysis; Encoding; Feature extraction; Image coding; Mobile communication; Training; Video sequences; Visualization; BRISK; Bag-of-Words; Visual features; binary descriptors; video coding