Abstract :
Many human actions are correlated, either because they occur as compound or sequential actions or because they resemble one another. Human annotations of 48 actions in the 4,774 videos from visint.org confirm that these correlations are strong. We exploit them to improve the detection of these 48 human actions, which range from simple actions such as walk to complex actions such as exchange. We apply a basic pipeline of STIP features, a Random Forest that quantizes the features into histograms, and an SVM classifier. First, we show that the sampling for the Random Forest can be improved by exploiting the correlations between human actions. Second, we show that using the posteriors of all 48 actions to detect a particular action further improves detection in general. We demonstrate a 50% relative improvement for human action detection in 1,294 realistic test videos.
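As an illustration of what "correlated in human annotations" can mean in practice, the sketch below computes pairwise Pearson correlations between binary per-video action labels. It is not the authors' procedure. The label matrix is toy data invented for this example, the action name `approach` and the helper names `pearson` and `correlation_matrix` are hypothetical, and only `walk` and `exchange` are actions named in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; report 0 by convention.
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_matrix(labels, actions):
    """Map each (action_a, action_b) pair to the correlation of their labels.

    labels: one row per video, one 0/1 entry per action (assumed layout).
    """
    columns = {a: [row[i] for row in labels] for i, a in enumerate(actions)}
    return {
        (a, b): pearson(columns[a], columns[b])
        for a in actions
        for b in actions
    }


# Toy annotations (invented): rows are videos, columns follow `actions`.
actions = ["walk", "approach", "exchange"]
labels = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
]

corr = correlation_matrix(labels, actions)
for (a, b), r in corr.items():
    if a < b:  # print each unordered pair once
        print(f"{a:>9} vs {b:<9} r = {r:+.3f}")
```

A matrix of this kind is one plausible input for the two ideas in the abstract: choosing which videos to sample when training the Random Forest for a given action, and deciding how much the posteriors of other actions should inform the detector for that action.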
Keywords :
feature extraction; gesture recognition; image classification; image sampling; support vector machines; video signal processing; SVM classifier; basic STIP feature pipeline; complex actions; compound actions; feature quantization; human action annotations; human action detection; random forest sampling; realistic test videos; sequential actions; Correlation; Histograms; Humans; Pipelines; Support vector machines; Vectors; Videos;