• DocumentCode
    3673896
  • Title

    From generic to specific deep representations for visual recognition

  • Author

    Hossein Azizpour;Ali Sharif Razavian;Josephine Sullivan;Atsuto Maki;Stefan Carlsson

  • Author_Institution
    KTH (Royal Institute of Technology), 114 28 Stockholm, Sweden
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    36
  • Lastpage
    45
  • Abstract
    Evidence is mounting that ConvNets are the best representation learning method for recognition. In the common scenario, a ConvNet is trained on a large labeled dataset, and the feed-forward unit activations at a certain layer of the network are used as a generic representation of an input image. Recent studies have shown this form of representation to be astoundingly effective for a wide range of recognition tasks. This paper thoroughly investigates the transferability of such representations with respect to several factors. These include parameters for training the network, such as its architecture, and parameters of feature extraction. We further show that different visual recognition tasks can be categorically ordered based on their distance from the source task. We then present results indicating a clear correlation between task performance and distance from the source task, conditioned on the proposed factors. Furthermore, by optimizing these factors, we achieve state-of-the-art performance on 16 visual recognition tasks.
  • Keywords
    "Visualization","Sun","Training","Positron emission tomography","Computer vision","Standards","Image recognition"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition Workshops (CVPRW), 2015 IEEE Conference on
  • Electronic_ISBN
    2160-7516
  • Type

    conf

  • DOI
    10.1109/CVPRW.2015.7301270
  • Filename
    7301270