  • DocumentCode
    3672197
  • Title
    Joint action recognition and pose estimation from video
  • Author
    Bruce Xiaohan Nie;Caiming Xiong;Song-Chun Zhu
  • Author_Institution
    Center for Vision, Cognition, Learning and Art, University of California, Los Angeles, USA
  • fYear
    2015
  • fDate
    6/1/2015 12:00:00 AM
  • Firstpage
    1293
  • Lastpage
    1301
  • Abstract
    Action recognition and pose estimation from video are closely related tasks for understanding human motion; most methods, however, learn separate models and combine them sequentially. In this paper, we propose a framework to integrate training and testing of the two tasks. A spatial-temporal And-Or graph model is introduced to represent action at three scales. Specifically, the action is decomposed into poses, which are further divided into mid-level ST-parts and then parts. The hierarchical structure of our model captures the geometric and appearance variations of pose at each frame, and lateral connections between ST-parts at adjacent frames capture the action-specific motion information. The model parameters for the three scales are learned discriminatively, and action labels and poses are efficiently inferred by dynamic programming. Experiments demonstrate that our approach achieves state-of-the-art accuracy in action recognition while also improving pose estimation.
  • Keywords
    "Joints","Feature extraction","Training","Hidden Markov models","Graphical models","Trajectory"
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision and Pattern Recognition (CVPR), 2015 IEEE Conference on
  • Electronic_ISBN
    1063-6919
  • Type
    conf
  • DOI
    10.1109/CVPR.2015.7298734
  • Filename
    7298734