Title :
Modeling human activities as speech
Author :
Chen, Chia-Chih; Aggarwal, J.K.
Author_Institution :
Dept. of ECE, Univ. of Texas at Austin, Austin, TX, USA
Abstract :
Human activity recognition and speech recognition appear to be loosely related research areas. On closer inspection, however, activity and speech signals share several analogies in the way they are generated, propagated, and perceived. In this paper, we propose a novel action representation, the action spectrogram, inspired by the common spectrographic representation of speech. Unlike a sound spectrogram, an action spectrogram is a space-time-frequency representation that characterizes the short-time spectral properties of body part movements. Whereas the essence of a speech signal is the variation of air pressure over time, our method models activities as likelihood time series of action-associated local interest patterns. This low-level process is realized by learning boosted window classifiers from spatially quantized spatio-temporal interest features. We have tested our algorithm on a variety of human activity datasets and achieved superior results.
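To make the pipeline concrete, the sketch below illustrates its signal-processing core under stated assumptions: the per-region likelihood time series are simulated with random data (in the paper they come from boosted window classifiers over spatio-temporal interest features), and the region count, sequence length, and STFT window/hop sizes are hypothetical placeholders rather than values from the paper. It is a minimal illustration of a space-time-frequency representation, not the authors' implementation.

```python
import numpy as np
from scipy.signal import stft

# Simulated per-frame classifier likelihoods for K spatially quantized
# body regions (hypothetical stand-in for the paper's boosted window
# classifier outputs over spatio-temporal interest features).
rng = np.random.default_rng(0)
num_frames, num_regions = 300, 8      # assumed sequence length / spatial bins
likelihoods = rng.random((num_regions, num_frames))

# A short-time Fourier transform of each region's likelihood series gives a
# space-time-frequency volume: one time-frequency plane per spatial region,
# analogous to a sound spectrogram computed from air-pressure variation.
window_len, hop = 32, 8               # assumed STFT window / hop (frames)
freqs, times, coeffs = stft(likelihoods, fs=1.0, nperseg=window_len,
                            noverlap=window_len - hop, axis=-1)
action_spectrogram = np.abs(coeffs)   # magnitudes, one plane per region

print(action_spectrogram.shape)       # (num_regions, freq_bins, time_steps)
```

Each slice of the resulting volume plays the role a spectrogram plays in speech recognition: short-time spectral features on which a downstream activity classifier could be trained.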
Keywords :
gesture recognition; signal classification; spectral analysis; speech recognition; time series; action representation; action spectrogram; body parts movements; boosted window classifiers; human activities modeling; human activity datasets; human activity recognition; likelihood time series; local interest patterns; short-time spectral property; sound spectrogram; space-time-frequency representation; spatially quantized spatio-temporal interest features; spectrographic speech representation; speech recognition; speech signals; Feature extraction; Humans; Speech; Speech recognition; Time series analysis; Training; Videos;
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on
Conference_Location :
Providence, RI
Print_ISBN :
978-1-4577-0394-2
DOI :
10.1109/CVPR.2011.5995555