Learning-Based, Automatic 2D-to-3D Image and Video Conversion

Author

Konrad, Janusz ; Meng Wang ; Ishwar, Prakash ; Chen Wu ; Mukherjee, Dipankar

Author_Institution

Dept. of Electr. & Comput. Eng., Boston Univ., Boston, MA, USA

Volume

22

Issue

9

fYear

2013

fDate

Sept. 2013

Firstpage

3485

Lastpage

3496

Abstract

Despite significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been the most successful but are also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality because they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene depth at that pixel using a regression-type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression-type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
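
The second method in the abstract (global depth estimation from a repository by nearest-neighbor regression) can be illustrated with a minimal sketch. This is not the authors' implementation: the block-mean descriptor, the Euclidean distance, the pixelwise median fusion, and the names global_descriptor and estimate_depth_knn are all assumptions chosen for illustration, and the sketch assumes grayscale images and depth maps of identical size.

```python
import numpy as np


def global_descriptor(image, grid=4):
    """Hypothetical global descriptor: mean intensity of each cell in a
    grid x grid partition of a grayscale image (assumption, not from the paper)."""
    h, w = image.shape
    bh, bw = h // grid, w // grid
    # Crop to a multiple of the cell size, then split into grid x grid cells.
    cells = image[:bh * grid, :bw * grid].reshape(grid, bh, grid, bw)
    return cells.mean(axis=(1, 3)).ravel()


def estimate_depth_knn(query, repo_images, repo_depths, k=3):
    """Estimate a depth map for `query` from its k nearest repository images.

    repo_images and repo_depths are parallel lists of 2D arrays
    (image+depth pairs). Returns the fused depth map and the indices
    of the selected neighbors.
    """
    q = global_descriptor(query)
    descs = np.stack([global_descriptor(img) for img in repo_images])
    # Euclidean distance between the query descriptor and every repository descriptor.
    dists = np.linalg.norm(descs - q, axis=1)
    nearest = np.argsort(dists)[:k]
    # Assumed fusion rule: pixelwise median of the neighbors' depth maps.
    fused = np.median(np.stack([repo_depths[i] for i in nearest]), axis=0)
    return fused, nearest
```

The pixelwise median is used here only because it tolerates an occasional poorly matched neighbor. A fuller pipeline would likely smooth the fused depth map and, for video, enforce temporal continuity across frames, as the abstract suggests.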

Keywords

learning (artificial intelligence); regression analysis; stereo image processing; video retrieval; video signal processing; 3D content; depth map; deterministic 3D scene model; human operators; learning-based automatic 2D-to-3D image conversion; local image-video attributes; nearest-neighbor regression type; point mapping; query image; temporal continuity; video conversion; 3D images; cross-bilateral filtering; image conversion; nearest neighbor classification; stereoscopic images;

fLanguage

English

Journal_Title

Image Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1057-7149

Type

jour

DOI

10.1109/TIP.2013.2270375

Filename

6544689