Title :
Action Recognition with Actons
Author :
Jun Zhu ; Baoyuan Wang ; Xiaokang Yang ; Wenjun Zhang ; Zhuowen Tu
Author_Institution :
Inst. of Image Commun. & Network Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
With improved access to a rapidly growing amount of video data and rising demand across a wide range of video analysis applications, video-based action recognition/classification has become an increasingly important task in computer vision. In this paper, we propose a two-layer structure for action recognition that automatically exploits a mid-level "acton" representation. The actons are learned via a new max-margin multi-channel multiple instance learning framework. The learned actons (which require no detailed manual annotations) are thus compact, informative, discriminative, and easy to scale. This differs from the standard unsupervised (e.g., k-means) or supervised (e.g., random forests) coding strategies in action recognition. Applying the learned actons in our two-layer structure yields state-of-the-art classification performance on the YouTube and HMDB51 datasets.
Keywords :
gesture recognition; learning (artificial intelligence); video signal processing; computer vision; max margin multichannel multiple instance learning framework; midlevel acton representation; supervised coding strategies; two-layer structure; unsupervised coding strategies; video analysis applications; video-based action classification; video-based action recognition; Computational modeling; Encoding; Feature extraction; Training; Vectors; Videos; Visualization;
Conference_Title :
Computer Vision (ICCV), 2013 IEEE International Conference on
Conference_Location :
Sydney, NSW
DOI :
10.1109/ICCV.2013.442