  • DocumentCode
    3605386
  • Title
    RGB-D Object Recognition via Incorporating Latent Data Structure and Prior Knowledge
  • Author
    Jinhui Tang; Lu Jin; Zechao Li; Shenghua Gao
  • Author_Institution
    Sch. of Comput. Sci. & Eng., Nanjing Univ. of Sci. & Technol., Nanjing, China
  • Volume
    17
  • Issue
    11
  • fYear
    2015
  • Firstpage
    1899
  • Lastpage
    1908
  • Abstract
    For RGB-D object recognition, it is important to identify suitable image representations, which can boost recognition performance. In this work, we propose a novel representation learning method for RGB-D images that jointly incorporates the underlying data structure and prior knowledge of the data. Specifically, convolutional neural networks (CNNs) are employed to learn image representations by exploiting the underlying data structure. To cope with the limited number of RGB and depth images available for object recognition, the multi-level feature hierarchies of a CNN trained on ImageNet are transferred to learn rich generic feature representations for RGB and depth images, while the labeled images are also leveraged. In addition, we propose a novel deep auto-encoder (DAE) to exploit the prior knowledge, which overcomes the expensive computational cost of optimization in feature encoding. The desired image representations are obtained by integrating the two types of representations. To verify the effectiveness of the proposed method, we conduct extensive experiments on two publicly available RGB-D datasets. The experimental results, compared with state-of-the-art approaches, demonstrate the advantages of the proposed method.
  • Keywords
    data structures; image representation; learning (artificial intelligence); object recognition; CNN; DAE; ImageNet; RGB-D object recognition; convolutional neural networks; deep auto-encoders; depth images; latent data structure; learning method; multilevel hierarchies; Data structures; Encoding; Feature extraction; Image coding; Image representation; Object recognition; Visualization; Deep learning; transfer learning
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Multimedia
  • Publisher
    IEEE
  • ISSN
    1520-9210
  • Type
    jour
  • DOI
    10.1109/TMM.2015.2476660
  • Filename
    7239585