DocumentCode :
2714374
Title :
Learning latent temporal structure for complex event detection
Author :
Tang, Kevin ; Fei-Fei, Li ; Koller, Daphne
fYear :
2012
fDate :
16-21 June 2012
Firstpage :
1250
Lastpage :
1257
Abstract :
In this paper, we tackle the problem of understanding the temporal structure of complex events in highly varying videos obtained from the Internet. Towards this goal, we utilize a conditional model trained in a max-margin framework that is able to automatically discover discriminative and interesting segments of video, while simultaneously achieving competitive accuracies on difficult detection and recognition tasks. We introduce latent variables over the frames of a video, and allow our algorithm to discover and assign sequences of states that are most discriminative for the event. Our model is based on the variable-duration hidden Markov model, and models durations of states in addition to the transitions between states. The simplicity of our model allows us to perform fast, exact inference using dynamic programming, which is extremely important when we set our sights on being able to process a very large number of videos quickly and efficiently. We show promising results on the Olympic Sports dataset [16] and the 2011 TRECVID Multimedia Event Detection task [18]. We also illustrate and visualize the semantic understanding capabilities of our model.
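The exact inference mentioned in the abstract can be illustrated with a generic segmental (variable-duration) Viterbi recursion. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `segmental_viterbi` and the score tables `unary`, `trans`, and `dur` are hypothetical, the scores are assumed to be given (the max-margin training is not shown), and adjacent segments are assumed to carry different states.

```python
NEG_INF = float("-inf")

def segmental_viterbi(unary, trans, dur):
    """Best segmentation of T frames into state segments (hypothetical sketch).

    unary[t][k]: score of frame t under state k
    trans[j][k]: score of moving from state j to state k
    dur[k][d-1]: score of state k lasting d frames (d = 1..D)
    Returns (best_score, [(start, end, state), ...]) with end exclusive.
    """
    T, K, D = len(unary), len(trans), len(dur[0])

    # prefix[k][t] = sum of unary[0..t-1][k], so a segment sum is O(1).
    prefix = [[0.0] * (T + 1) for _ in range(K)]
    for k in range(K):
        for t in range(T):
            prefix[k][t + 1] = prefix[k][t] + unary[t][k]

    # best[t][k]: best score for frames 0..t-1 whose last segment has state k.
    best = [[NEG_INF] * K for _ in range(T + 1)]
    back = [[None] * K for _ in range(T + 1)]

    for t in range(1, T + 1):
        for k in range(K):
            for d in range(1, min(D, t) + 1):
                s = t - d  # the last segment covers frames s..t-1
                seg = prefix[k][t] - prefix[k][s] + dur[k][d - 1]
                if s == 0:
                    cand, prev = seg, None  # first segment, no transition
                else:
                    prev_score, prev = max(
                        ((best[s][j] + trans[j][k], j)
                         for j in range(K) if j != k),
                        default=(NEG_INF, None),
                    )
                    cand = prev_score + seg
                if cand > best[t][k]:
                    best[t][k] = cand
                    back[t][k] = (s, prev)

    # Backtrace from the best final state.
    k = max(range(K), key=lambda j: best[T][j])
    score, t, segments = best[T][k], T, []
    while t > 0:
        s, prev = back[t][k]
        segments.append((s, t, k))
        t, k = s, prev
    segments.reverse()
    return score, segments
```

With T frames, K states, and a maximum duration of D, this recursion runs in O(T * K^2 * D) time, which is what makes exact inference practical for this kind of model.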
Keywords :
Internet; dynamic programming; hidden Markov models; inference mechanisms; object detection; object recognition; video signal processing; 2011 TRECVID Multimedia Event Detection task; Olympic Sports dataset; complex event detection; conditional model; detection tasks; exact inference; latent temporal structure learning; latent variables; max-margin framework; recognition tasks; state model durations; variable-duration hidden Markov model; video discriminative segment discovery; video interesting segment discovery; Event detection; Histograms; Motion segmentation; Vectors; Videos
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
Conference_Location :
Providence, RI
ISSN :
1063-6919
Print_ISBN :
978-1-4673-1226-4
Electronic_ISBN :
1063-6919
Type :
conf
DOI :
10.1109/CVPR.2012.6247808
Filename :
6247808