• DocumentCode
    3673900
  • Title
    Multi-scale pyramid pooling for deep convolutional representation
  • Author
    Donggeun Yoo; Sunggyun Park; Joon-Young Lee; In So Kweon
  • Author_Institution
    KAIST, Daejeon, 305-701, Korea
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    71
  • Lastpage
    80
  • Abstract
    Compared to image representation based on low-level local descriptors, deep neural activations of Convolutional Neural Networks (CNNs) are richer in mid-level representation, but poorer in geometric invariance properties. In this paper, we present a straightforward framework for better image representation by combining the two approaches. To take advantage of both representations, we extract a fair amount of multi-scale dense local activations from a pre-trained CNN. We then aggregate the activations with the Fisher kernel framework, modified with a simple scale-wise normalization that is essential to making it suitable for CNN activations. Our representation demonstrates new state-of-the-art performance on three public datasets: 80.78% (Acc.) on MIT Indoor 67, 83.20% (mAP) on PASCAL VOC 2007 and 91.28% (Acc.) on Oxford 102 Flowers. The results suggest that our proposal can be used as a primary image representation for better performance across a wide range of visual recognition tasks.
  • Keywords
    "Kernel","Image representation","Visualization","Aggregates","Image recognition","Accuracy","Support vector machines"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2015 IEEE Conference on
  • Electronic_ISBN
    2160-7516
  • Type
    conf
  • DOI
    10.1109/CVPRW.2015.7301274
  • Filename
    7301274