• DocumentCode
    1475456
  • Title
    Localized Multiple Kernel Learning for Realistic Human Action Recognition in Videos
  • Author
    Song, Yan; Zheng, Yan-Tao; Tang, Sheng; Zhou, Xiangdong; Zhang, Yongdong; Lin, Shouxun; Chua, Tat-Seng
  • Author_Institution
    Lab. of Adv. Comput. Res., Chinese Acad. of Sci., Beijing, China
  • Volume
    21
  • Issue
    9
  • fYear
    2011
  • Firstpage
    1193
  • Lastpage
    1202
  • Abstract
    Realistic human action recognition in videos is a useful yet challenging task. Video shots of the same action may exhibit large intra-class variations in visual appearance, kinetic patterns, video shooting, and editing styles. Heterogeneous feature representations of videos pose a further challenge: how to handle the redundancy, complementarity, and disagreement among these features effectively. This paper proposes a localized multiple kernel learning (L-MKL) algorithm to address these issues. L-MKL integrates localized classifier ensemble learning and multiple kernel learning in a unified framework to leverage the strengths of both. The basis of L-MKL is to build multiple kernel classifiers on diverse features at subspace localities of the heterogeneous representations. L-MKL integrates the discriminability of complementary features locally and enables each localized MKL classifier to deliver better performance in its own region of expertise. Specifically, L-MKL develops a locality gating model that partitions the input space of heterogeneous representations into a set of localities with simpler data structure. Each locality then learns its own optimal combination of Mercer kernels over the heterogeneous features. Finally, the gating model coordinates the localized multiple kernel classifiers globally to perform action recognition. Experiments on two datasets show that the proposed approach delivers promising performance.
  • Keywords
    image recognition; image representation; learning (artificial intelligence); video signal processing; heterogeneous feature representations; intraclass variations; kinetic patterns; localized classifier ensemble learning; localized multiple kernel learning; realistic human action recognition; video shots; visual appearance; Classification algorithms; Computational modeling; Humans; Kernel; Support vector machines; Training; Videos; Action recognition; localized classifier; multiple kernel learning;
  • fLanguage
    English
  • Journal_Title
    Circuits and Systems for Video Technology, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1051-8215
  • Type
    jour
  • DOI
    10.1109/TCSVT.2011.2130230
  • Filename
    5734823