Author_Institution :
Nat. Lab. of Pattern Recognition, Inst. of Autom., Beijing, China
Abstract :
Slow Feature Analysis (SFA) extracts slowly varying features from a quickly varying input signal [1]. It has been successfully applied to modeling the visual receptive fields of cortical neurons. Substantial experimental results in neuroscience suggest that the temporal slowness principle is a general learning principle in visual perception. In this paper, we introduce the SFA framework to the problem of human action recognition by incorporating discriminative information into SFA learning and considering the spatial relationship of body parts. In particular, we consider four kinds of SFA learning strategies, namely the original unsupervised SFA (U-SFA), the supervised SFA (S-SFA), the discriminative SFA (D-SFA), and the spatial discriminative SFA (SD-SFA), to extract slow feature functions from a large number of training cuboids obtained by random sampling in motion boundaries. Afterward, to represent action sequences, the squared first-order temporal derivatives are accumulated over all transformed cuboids into one feature vector, which is termed the Accumulated Squared Derivative (ASD) feature. The ASD feature encodes the statistical distribution of slow features in an action sequence. Finally, a linear support vector machine (SVM) is trained to classify actions represented by ASD features. We conduct extensive experiments, including two sets of control experiments, two sets of large-scale experiments on the KTH and Weizmann databases, and two sets of experiments on the CASIA and UT-interaction databases, to demonstrate the effectiveness of SFA for human action recognition. Experimental results suggest that the SFA-based approach (1) is able to extract useful motion patterns and improves recognition performance, (2) requires fewer intermediate processing steps while achieving comparable or even better performance, and (3) has good potential to recognize complex multiperson activities.
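The pipeline described in the abstract can be illustrated with a minimal sketch of the unsupervised case only. It assumes plain linear SFA (whitening followed by an eigendecomposition of the derivative covariance) and cuboids that have already been flattened to arrays of shape (frames, dimensions). The function names `fit_linear_sfa` and `asd_feature` are hypothetical and are not taken from the paper; the supervised, discriminative, and spatial variants, any nonlinear expansion, and the SVM stage are not shown.

```python
import numpy as np


def fit_linear_sfa(x, n_components):
    """Fit linear SFA on a time series x of shape (T, D).

    Returns (mean, W) such that (x - mean) @ W gives unit-variance
    features ordered from slowest to fastest.
    """
    mean = x.mean(axis=0)
    xc = x - mean

    # Whiten the input so every retained direction has unit variance.
    evals, evecs = np.linalg.eigh(np.cov(xc, rowvar=False))
    keep = evals > 1e-10  # drop numerically degenerate directions
    whitener = evecs[:, keep] / np.sqrt(evals[keep])
    z = xc @ whitener

    # Covariance of the first-order temporal differences.
    dz = np.diff(z, axis=0)
    dcov = dz.T @ dz / dz.shape[0]

    # eigh returns eigenvalues in ascending order, so the first
    # eigenvectors are the directions that change most slowly.
    _, d_evecs = np.linalg.eigh(dcov)
    W = whitener @ d_evecs[:, :n_components]
    return mean, W


def asd_feature(cuboids, mean, W):
    """Accumulate squared first-order temporal derivatives.

    cuboids is an iterable of (T_i, D) arrays. The result has one
    entry per slow feature function.
    """
    total = np.zeros(W.shape[1])
    for c in cuboids:
        y = (c - mean) @ W  # transform the cuboid
        total += (np.diff(y, axis=0) ** 2).sum(axis=0)
    return total
```

Under these assumptions, each action sequence yields one vector from `asd_feature`, and those vectors would be the inputs to the linear SVM mentioned in the abstract.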
Keywords :
feature extraction; image coding; image motion analysis; image recognition; learning (artificial intelligence); statistical distributions; support vector machines; visual databases; ASD feature encoding; UT-interaction database; Weizmann database; accumulated squared derivative feature vector; action sequence; complex multiperson activity recognition; cortical neuron; feature function extraction; human action recognition performance; learning principle; linear support vector machine; motion boundary; motion pattern; original unsupervised SFA learning strategy; slowly varying feature analysis; spatial discriminative SFA; spatial relationship; statistical distribution; supervised SFA-based approach; temporal slowness principle; visual perception; visual receptive field; Feature extraction; Humans; Neurons; Pattern recognition; Spatiotemporal phenomena; Vectors; Visualization; Human action recognition; slow feature analysis; Algorithms; Humans; Support Vector Machines; Visual Perception;