Title :
Large-Scale Video Classification with Convolutional Neural Networks
Author :
Karpathy, Andrej ; Toderici, George ; Shetty, Sachin ; Leung, Tommy ; Sukthankar, Rahul ; Li, Fei-Fei
Author_Institution :
Google Res., Stanford Univ., Stanford, CA, USA
Abstract :
Convolutional Neural Networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 1 million YouTube videos belonging to 487 classes. We study multiple approaches for extending the connectivity of a CNN in the time domain to take advantage of local spatio-temporal information and suggest a multiresolution, foveated architecture as a promising way of speeding up training. Our best spatio-temporal networks display significant performance improvements compared to strong feature-based baselines (55.3% to 63.9%), but only a surprisingly modest improvement compared to single-frame models (59.3% to 60.9%). We further study the generalization performance of our best model by retraining the top layers on the UCF-101 Action Recognition dataset and observe significant performance improvements compared to the UCF-101 baseline model (63.3% up from 43.9%).
Keywords :
image classification; image motion analysis; multimedia computing; neural nets; social networking (online); spatiotemporal phenomena; video signal processing; CNN; UCF-101 action recognition dataset; UCF-101 baseline model; YouTube videos; convolutional neural networks; feature-based baselines; image recognition problems; local spatiotemporal information; spatiotemporal networks; video classification; Computational modeling; Computer architecture; Feature extraction; Spatial resolution; Streaming media; Training; action; classification; convolutional; dataset; large-scale; network; neural; recognition; sports; video;
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2014 IEEE Conference on
Conference_Location :
Columbus, OH
DOI :
10.1109/CVPR.2014.223