Title :
Hierarchical Filtered Motion for Action Recognition in Crowded Videos
Author :
Tian, YingLi ; Cao, Liangliang ; Liu, Zicheng ; Zhang, Zhengyou
Author_Institution :
IBM T.J. Watson Res. Center, Yorktown Heights, NY, USA
fDate :
5/1/2012 12:00:00 AM
Abstract :
Action recognition against a cluttered, moving background is a challenging problem. One main difficulty is that the motion field in an action region is contaminated by background motion. We propose a hierarchical filtered motion (HFM) method to recognize actions in crowded videos, using the motion history image (MHI) as the basic representation of motion because of its robustness and efficiency. First, we detect interest points as two-dimensional Harris corners with recent motion, e.g., locations with high intensities in the MHI. Then, a global spatial motion smoothing filter is applied to the gradients of the MHI to eliminate isolated unreliable or noisy motions. At each interest point, a local motion field filter is applied to the smoothed gradients of the MHI by computing the structure proximity between each pixel in the local region and the interest point. Thus, the motion at a pixel is enhanced or weakened based on its structure proximity to the interest point. To validate the method's effectiveness, we characterize the spatial and temporal features by histograms of oriented gradients in the intensity image and the MHI, respectively, and use a Gaussian-mixture-model-based classifier for action recognition. The proposed approach achieves state-of-the-art results on the KTH dataset, which has a clean background. More importantly, we perform cross-dataset action classification and detection experiments, in which the KTH dataset is used for training and the Microsoft Research (MSR) Action Dataset II, which consists of crowded videos with people moving in the background, is used for testing. Our experiments show that the proposed HFM method significantly outperforms existing techniques.
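The abstract does not state the MHI update rule. The following is a minimal sketch of the standard formulation, in which pixels that just moved are set to a maximum value and older motion decays over time, so that "high intensities in the MHI" correspond to recent motion. Simple frame differencing stands in for the motion detector. The names update_mhi, tau, and diff_threshold, and their default values, are illustrative assumptions and are not taken from the paper.

```python
def update_mhi(mhi, prev_frame, frame, tau=20, diff_threshold=30):
    """Return the next motion history image (MHI).

    Assumed rule (standard MHI, not confirmed by the abstract):
    a pixel whose intensity changed by more than diff_threshold is
    set to tau; every other pixel decays by 1, floored at 0.
    Images are lists of rows of integer grey levels.
    """
    next_mhi = []
    for mhi_row, prev_row, cur_row in zip(mhi, prev_frame, frame):
        next_row = []
        for h, p, c in zip(mhi_row, prev_row, cur_row):
            moving = abs(c - p) > diff_threshold
            next_row.append(tau if moving else max(0, h - 1))
        next_mhi.append(next_row)
    return next_mhi


# Toy 1x3 "video": a bright pixel moves from column 0 to column 1,
# then the scene stays still for one frame.
frames = [
    [[255, 0, 0]],
    [[0, 255, 0]],
    [[0, 255, 0]],
]
mhi = [[0, 0, 0]]
for prev_frame, frame in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev_frame, frame, tau=20)
```

In this toy run, the two pixels touched by the motion hold the highest MHI values and the untouched pixel stays at zero. Under the same assumptions, interest points would then be restricted to corners that fall on such high-valued pixels.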
Keywords :
Gaussian processes; filtering theory; image classification; image motion analysis; object detection; object recognition; video signal processing; Gaussian-mixture-model-based classifier; HFM method; KTH dataset; Microsoft Research Action Dataset II; action recognition; background motions; cluttered background; cross-dataset action classification; cross-dataset action detection; crowded videos; two-dimensional Harris corners; global spatial motion smoothing filter; hierarchical filtered motion method; histograms of oriented gradient; intensity image; interest point detection; local motion field filter; motion history image; motion representation; moving background; spatial features; structure proximity computation; temporal features; Feature extraction; Histograms; History; Lighting; Pixel; Robustness; Videos; Action classification; action detection; crowded videos; hierarchical filtered motion (HFM); motion history image (MHI);
Journal_Title :
Systems, Man, and Cybernetics, Part C: Applications and Reviews, IEEE Transactions on
DOI :
10.1109/TSMCC.2011.2149519