  • DocumentCode
    3406616
  • Title
    Saliency-based selection of sparse descriptors for action recognition
  • Author
    Vig, Eleonora ; Dorr, Michael ; Cox, David D.
  • Author_Institution
    Rowland Inst. at Harvard, Cambridge, MA, USA
  • fYear
    2012
  • fDate
    Sept. 30 2012-Oct. 3 2012
  • Firstpage
    1405
  • Lastpage
    1408
  • Abstract
    Local spatiotemporal descriptors are used successfully as a powerful video representation for action recognition. Particularly competitive recognition performance is achieved when these descriptors are densely sampled on a regular grid; in contrast to existing approaches that are based on features at interest points, dense sampling captures more contextual information, albeit at high computational cost. Here, we combine the advantages of both dense and sparse sampling. Once descriptors are extracted on a dense grid, we prune them either randomly or based on a sparse saliency mask of the underlying video. The method is evaluated using two state-of-the-art algorithms on the challenging Hollywood2 benchmark. Classification performance is maintained with as few as 30% of the descriptors, while more modest saliency-based pruning of descriptors yields improved performance. With roughly 80% of the descriptors of the Dense Trajectories model, we outperform all previously reported methods, obtaining a mean average precision of 59.5%.
  • Keywords
    feature extraction; image classification; image representation; image sampling; object recognition; video signal processing; Hollywood2 benchmark; action recognition; classification performance; contextual information; dense sampling; dense trajectories model; descriptor extraction; descriptor saliency-based pruning; local spatiotemporal descriptors; saliency-based selection; sparse descriptors; sparse saliency mask; sparse sampling; video representation; Computational modeling; Feature extraction; Histograms; Humans; Spatiotemporal phenomena; Trajectory; Visualization; Action Recognition; Saliency Maps; Space-time Image Descriptors; Sparse Representations;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Image Processing (ICIP), 2012 19th IEEE International Conference on
  • Conference_Location
    Orlando, FL
  • ISSN
    1522-4880
  • Print_ISBN
    978-1-4673-2534-9
  • Electronic_ISBN
    1522-4880
  • Type
    conf
  • DOI
    10.1109/ICIP.2012.6467132
  • Filename
    6467132