Title :
Encoding spatio-temporal distribution by generalized VLAD for action recognition
Author :
Biyun Sheng ; Yan Yan ; Changyin Sun
Author_Institution :
Sch. of Autom., Southeast Univ., Nanjing, China
Abstract :
The location information of interest points is an important cue for action recognition. To model their spatio-temporal distribution, we propose a novel position feature constructed from the normalized pairwise relative positions of points. The Vector of Locally Aggregated Descriptors (VLAD), which aggregates the differences between descriptors and visual words, has achieved promising performance. However, the original VLAD assigns equal weights to all difference vectors and ignores the zero-order statistics of local descriptors. In this paper, we present Generalized VLAD (GVLAD), an extension of VLAD that encodes the position features as well as local appearance descriptors and takes both different weights and zero-order information into account. State-of-the-art performance on two benchmark datasets validates the effectiveness of the proposed method.
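For orientation, the baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard VLAD only, not of GVLAD or of the authors' implementation: the function name `vlad`, the hard nearest-word assignment, and the global L2 normalization are assumptions chosen for the example, and the codebook of visual words is taken as given.

```python
import math

def vlad(descriptors, centers):
    """Standard VLAD sketch: per visual word, sum the residuals
    (descriptor - word) of the descriptors assigned to that word."""
    k = len(centers)
    d = len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard assignment to the nearest visual word (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(x, centers[i])),
        )
        # Every residual is added with the same weight of 1.
        for j in range(d):
            residuals[nearest][j] += x[j] - centers[nearest][j]
    # Concatenate the k residual sums and L2-normalize (assumed normalization).
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat

# Hypothetical toy data: two 2-D visual words and three descriptors.
centers = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 1.0], [11.0, 10.0]]
encoding = vlad(descriptors, centers)
```

In this sketch each difference vector contributes with equal weight, and the number of descriptors assigned to each word is not recorded; these are the two limitations of the original VLAD that the abstract names.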
Keywords :
image recognition; spatiotemporal phenomena; video coding; GVLAD; action recognition cue; benchmark datasets; difference vectors; generalized VLAD; interest point location information; local appearance descriptors; normalized pairwise relative point position; position feature; position feature encoding; spatio-temporal distribution encoding; spatio-temporal distribution modelling; vector-of-locally aggregated descriptors; visual words; zero-order information; Accuracy; Cameras; Computational modeling; Dictionaries; Encoding; Three-dimensional displays; Visualization;
Conference_Title :
Electrical and Computer Engineering (CCECE), 2015 IEEE 28th Canadian Conference on
Conference_Location :
Halifax, NS
Print_ISBN :
978-1-4799-5827-6
DOI :
10.1109/CCECE.2015.7129346