• DocumentCode
    3428433
  • Title
    Concurrent Action Detection with Structural Prediction
  • Author
    Ping Wei ; Nanning Zheng ; Yibiao Zhao ; Song-Chun Zhu
  • Author_Institution
    Xi'an Jiaotong Univ., Xi'an, China
  • fYear
    2013
  • fDate
    1-8 Dec. 2013
  • Firstpage
    3136
  • Lastpage
    3143
  • Abstract
    Action recognition has often been posed as a classification problem, which assumes that a video sequence has only one action class label and that different actions are independent. However, a single human body can perform multiple concurrent actions at the same time, and different actions interact with each other. This paper proposes a concurrent action detection model in which action detection is formulated as a structural prediction problem. In this model, an interval in a video sequence can be described by multiple action labels. A detected action interval is determined both by the unary local detector and by its relations with other actions. We use a wavelet feature to represent the action sequence, and design a composite temporal logic descriptor to describe the action relations. The model parameters are trained by structural SVM learning. Given a long video sequence, a sequential decision window search algorithm is designed to detect the actions. Experiments on our newly collected concurrent action dataset demonstrate the strength of our method.
  • Keywords
    image classification; image recognition; image sequences; learning (artificial intelligence); object detection; search problems; video signal processing; wavelet transforms; action interval determination; action labels; action recognition; action sequence representation; classification problem; composite temporal logic descriptor; concurrent action dataset; concurrent action detection model; model parameter training; sequential decision window search algorithm; structural SVM learning; structural prediction; structural prediction problem; unary local detector; video sequence; wavelet feature; Detectors; Joints; Keyboards; Three-dimensional displays; Vectors; Video sequences; Wavelet transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Vision (ICCV), 2013 IEEE International Conference on
  • Conference_Location
    Sydney, NSW
  • ISSN
    1550-5499
  • Type
    conf
  • DOI
    10.1109/ICCV.2013.389
  • Filename
    6751501