• DocumentCode
    4232
  • Title
    Learning-Based, Automatic 2D-to-3D Image and Video Conversion
  • Author
    Konrad, Janusz; Wang, Meng; Ishwar, Prakash; Wu, Chen; Mukherjee, Dipankar
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Boston Univ., Boston, MA, USA
  • Volume
    22
  • Issue
    9
  • fYear
    2013
  • fDate
    Sept. 2013
  • Firstpage
    3485
  • Lastpage
    3496
  • Abstract
    Despite significant growth in the last few years, the availability of 3D content is still dwarfed by that of its 2D counterpart. To close this gap, many 2D-to-3D image and video conversion methods have been proposed. Methods involving human operators have been most successful but also time-consuming and costly. Automatic methods, which typically make use of a deterministic 3D scene model, have not yet achieved the same level of quality for they rely on assumptions that are often violated in practice. In this paper, we propose a new class of methods that are based on the radically different approach of learning the 2D-to-3D conversion from examples. We develop two types of methods. The first is based on learning a point mapping from local image/video attributes, such as color, spatial position, and, in the case of video, motion at each pixel, to scene-depth at that pixel using a regression type idea. The second method is based on globally estimating the entire depth map of a query image directly from a repository of 3D images (image+depth pairs or stereopairs) using a nearest-neighbor regression type idea. We demonstrate both the efficacy and the computational efficiency of our methods on numerous 2D images and discuss their drawbacks and benefits. Although far from perfect, our results demonstrate that repositories of 3D content can be used for effective 2D-to-3D image conversion. An extension to video is immediate by enforcing temporal continuity of computed depth maps.
  • Keywords
    learning (artificial intelligence); regression analysis; stereo image processing; video retrieval; video signal processing; 3D content; depth map; deterministic 3D scene model; human operators; learning-based automatic 2D-to-3D image conversion; local image-video attributes; nearest-neighbor regression type; point mapping; query image; temporal continuity; video conversion; 3D images; cross-bilateral filtering; image conversion; nearest neighbor classification; stereoscopic images
  • fLanguage
    English
  • Journal_Title
    Image Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1057-7149
  • Type
    jour
  • DOI
    10.1109/TIP.2013.2270375
  • Filename
    6544689
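
The second method in the abstract (estimating a whole depth map from a repository of image+depth pairs by nearest-neighbor regression) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the intensity-histogram descriptor, the pixelwise median used to fuse neighbor depths, the equal image sizes, and the names `descriptor` and `estimate_depth`.

```python
from statistics import median


def descriptor(image, bins=4):
    """Hypothetical global descriptor: normalized histogram of 0..255 intensities."""
    hist = [0] * bins
    count = 0
    for row in image:
        for value in row:
            hist[min(value * bins // 256, bins - 1)] += 1
            count += 1
    return [h / count for h in hist]


def estimate_depth(query, repository, k=3):
    """Estimate a depth map for `query` from (image, depth) pairs.

    Assumes every image and depth map has the same size as the query.
    """
    q = descriptor(query)

    def distance(pair):
        # Squared Euclidean distance between global descriptors.
        d = descriptor(pair[0])
        return sum((a - b) ** 2 for a, b in zip(q, d))

    # Keep the k repository entries most similar to the query.
    neighbors = sorted(repository, key=distance)[:k]

    rows, cols = len(query), len(query[0])
    # Assumed fusion rule: pixelwise median of the neighbors' depth maps.
    return [
        [median(depth[r][c] for _, depth in neighbors) for c in range(cols)]
        for r in range(rows)
    ]


# Toy repository: two dark images with known depth, one bright image.
repository = [
    ([[10, 20], [30, 40]], [[1, 1], [2, 2]]),
    ([[15, 25], [35, 45]], [[3, 3], [4, 4]]),
    ([[200, 210], [220, 230]], [[9, 9], [9, 9]]),
]
query = [[12, 22], [32, 42]]
estimated = estimate_depth(query, repository, k=2)
```

With `k=2` the bright image is excluded and the estimate is the pixelwise median of the two dark images' depth maps. A practical system would use a stronger descriptor and smooth the fused depth map; the keywords mention cross-bilateral filtering, which this sketch omits.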