Visual co-occurrence network: using context for large-scale object recognition in retail

Author

Siddharth Advani;Brigid Smith;Yasuki Tanabe;Kevin Irick;Matthew Cotter;Jack Sampson;Vijaykrishnan Narayanan

Author_Institution

The Pennsylvania State University, USA

fYear

2015

fDate

10/1/2015 12:00:00 AM

Firstpage

1

Lastpage

10

Abstract

In any visual object recognition system, classification accuracy is likely to determine the usefulness of the system as a whole. In many real-world applications, the system must also recognize a large number of diverse objects to be robust enough to handle the sort of tasks that the human visual system handles on an average day. These objectives are often at odds with performance, as running too many detectors on any one scene is prohibitively slow for any real-time scenario. However, visual information has temporal and spatial context that can be exploited to reduce the number of detectors that need to be triggered at any given instant. In this paper, we propose a dynamic approach to encoding such context, called the Visual Co-occurrence Network (ViCoNet), which establishes relationships between objects observed in a visual scene. We investigate the utility of ViCoNet when integrated into a vision pipeline targeted at retail shopping. On a large and deep dataset, we achieve a 50% improvement in performance and a 7% improvement in accuracy in the best case, and a 45% improvement in performance and a 3% improvement in accuracy in the average case, over an established baseline. The memory overhead of ViCoNet is around 10 KB, highlighting its effectiveness on temporal big data.

Keywords

"Context","Visualization","Object recognition","Feature extraction","Real-time systems","Image recognition","Image segmentation"

Publisher

IEEE

Conference_Titel

Embedded Systems For Real-time Multimedia (ESTIMedia), 2015 13th IEEE Symposium on

Type

conf

DOI

10.1109/ESTIMedia.2015.7351774

Filename

7351774