DocumentCode :
3423929
Title :
3D Scene Understanding by Voxel-CRF
Author :
Kim, Byung-soo ; Kohli, Pushmeet ; Savarese, Silvio
Author_Institution :
Univ. of Michigan, Ann Arbor, MI, USA
fYear :
2013
fDate :
1-8 Dec. 2013
Firstpage :
1425
Lastpage :
1432
Abstract :
Scene understanding is an important yet very challenging problem in computer vision. In the past few years, researchers have taken advantage of the recent diffusion of depth-RGB (RGB-D) cameras to help simplify the problem of inferring scene semantics. However, while the added 3D geometry is certainly useful for segmenting out objects with different depth values, it also adds complications: the 3D geometry is often incorrect because of noisy depth measurements, and the actual 3D extent of the objects is usually unknown because of occlusions. In this paper we propose a new method that allows us to jointly refine the 3D reconstruction of the scene (raw depth values) while accurately segmenting out the objects or scene elements from the 3D reconstruction. This is achieved by introducing a new model which we call Voxel-CRF. The Voxel-CRF model is based on the idea of constructing a conditional random field over a 3D volume of interest, which captures the semantic and 3D geometric relationships among different elements (voxels) of the scene. Such a model allows us to jointly estimate (1) a dense voxel-based 3D reconstruction and (2) the semantic labels associated with each voxel, even in the presence of partial occlusions, using an approximate yet efficient inference strategy. We evaluated our method on the challenging NYU Depth dataset (Versions 1 and 2). Experimental results show that our method achieves competitive accuracy in inferring scene semantics and visually appealing results in improving the quality of the 3D reconstruction. We also demonstrate an interesting application of object removal and scene completion from RGB-D images.
Keywords :
cameras; computer vision; geometry; image colour analysis; image reconstruction; inference mechanisms; random processes; solid modelling; 3D geometric relationships; 3D scene reconstruction; 3D scene understanding; NYU depth dataset; RGB-D images; Voxel-CRF model; computer vision; conditional random field; depth-RGB cameras; inference strategy; noisy depth measurements; object removal; object segmentation; occlusions; scene semantics; semantic labels; voxel-based 3D reconstruction; Cameras; Image reconstruction; Labeling; Noise measurement; Semantics; Solid modeling; Three-dimensional displays; 3D reconstruction; RGB-D; Scene understanding;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer Vision (ICCV), 2013 IEEE International Conference on
Conference_Location :
Sydney, NSW
ISSN :
1550-5499
Type :
conf
DOI :
10.1109/ICCV.2013.180
Filename :
6751287