Title :
A Multi-modal Graphical Model for Scene Analysis
Author :
Namin, Sarah Taghavi ; Najafi, Mohammad ; Salzmann, Mathieu ; Petersson, Lars
Abstract :
In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, where corresponding 2D and 3D regions are forced to receive identical labels. This results in performance degradation due to misalignments, 3D-2D projection errors and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in a modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate the ability of our model to support multiple correspondences for objects in the 3D and 2D domains, we introduce a new multi-modal dataset, which is composed of panoramic images and LIDAR data and features a rich set of many-to-one correspondences.
Keywords :
graph theory; image classification; image segmentation; natural scenes; 2D domains; 2D-3D data; 2D-3D occlusions; 2D-3D projection errors; 3D domains; LIDAR data; class label estimation; identical labels; information leveraging; many-to-one correspondences; modalities; multimodal dataset; multimodal graphical model; panoramic images; performance degradation; publicly available dataset; scene analysis; semantic segmentation problems; soft correspondences; Graphical models; Image segmentation; Labeling; Laser radar; Semantics; Three-dimensional displays; Vectors
Conference_Title :
Applications of Computer Vision (WACV), 2015 IEEE Winter Conference on
Conference_Location :
Waikoloa, HI
DOI :
10.1109/WACV.2015.139