DocumentCode
714104
Title
Encoding spatio-temporal distribution by generalized VLAD for action recognition
Author
Biyun Sheng ; Yan Yan ; Changyin Sun
Author_Institution
School of Automation, Southeast University, Nanjing, China
fYear
2015
fDate
3-6 May 2015
Firstpage
620
Lastpage
625
Abstract
The location information of interest points is an important cue for action recognition. To model the spatio-temporal distribution, we propose a novel position feature constructed from the normalized pairwise relative positions of interest points. The Vector of Locally Aggregated Descriptors (VLAD), which accumulates the differences between local descriptors and their assigned visual words, has achieved promising performance. However, the original VLAD assigns equal weights to all difference vectors and ignores the zero-order statistics of the local descriptors. In this paper, we present Generalized VLAD (GVLAD), an extension of VLAD that encodes the position features as well as the local appearance descriptors, taking descriptor-specific weights and zero-order information into account simultaneously. State-of-the-art performance on two benchmark datasets validates the effectiveness of the proposed method.
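Since the record only summarizes the encoding scheme, the following is a minimal sketch of standard VLAD aggregation together with a weighted, zero-order-augmented variant loosely in the spirit of the abstract's GVLAD idea, plus a toy pairwise-relative-position feature. The codebook construction, the distance-based weighting, the normalization steps, and all function names are assumptions for illustration, not the authors' exact formulation.

# Illustrative sketch only; weighting and normalization choices are assumptions.
import numpy as np
from sklearn.cluster import KMeans

def pairwise_position_feature(points):
    """Toy normalized pairwise relative positions of interest points (illustrative)."""
    diffs = points[:, None, :] - points[None, :, :]
    return diffs / (np.abs(diffs).max() + 1e-12)

def vlad_encode(descriptors, codebook):
    """Standard VLAD: sum first-order residuals w.r.t. each descriptor's nearest visual word."""
    k, d = codebook.shape
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assignments = dists.argmin(axis=1)
    vlad = np.zeros((k, d))
    for i in range(k):
        members = descriptors[assignments == i]
        if len(members):
            vlad[i] = (members - codebook[i]).sum(axis=0)
    vlad = vlad.ravel()
    vlad = np.sign(vlad) * np.sqrt(np.abs(vlad))   # signed square-root normalization
    return vlad / (np.linalg.norm(vlad) + 1e-12)   # L2 normalization

def generalized_vlad_encode(descriptors, codebook):
    """GVLAD-style variant (assumed form): weighted residuals plus zero-order soft counts."""
    k, d = codebook.shape
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assignments = dists.argmin(axis=1)
    # Per-descriptor weights from distance to the assigned word (an assumption).
    weights = np.exp(-dists[np.arange(len(descriptors)), assignments])
    first_order = np.zeros((k, d))
    zero_order = np.zeros(k)                       # zero-order statistics per visual word
    for i in range(k):
        idx = assignments == i
        if idx.any():
            w = weights[idx][:, None]
            first_order[i] = (w * (descriptors[idx] - codebook[i])).sum(axis=0)
            zero_order[i] = weights[idx].sum()
    feat = np.concatenate([first_order.ravel(), zero_order])
    feat = np.sign(feat) * np.sqrt(np.abs(feat))
    return feat / (np.linalg.norm(feat) + 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    local_descs = rng.normal(size=(500, 64))       # e.g. appearance or position features
    codebook = KMeans(n_clusters=16, n_init=10, random_state=0).fit(local_descs).cluster_centers_
    print(vlad_encode(local_descs, codebook).shape)              # (16*64,)
    print(generalized_vlad_encode(local_descs, codebook).shape)  # (16*64 + 16,)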
Keywords
image recognition; spatiotemporal phenomena; video coding; GVLAD; action recognition cue; benchmark datasets; difference vectors; generalized VLAD; interest point location information; local appearance descriptors; normalized pairwise relative point position; position feature; position feature encoding; spatio-temporal distribution encoding; spatio-temporal distribution modelling; vector of locally aggregated descriptors; visual words; zero-order information; Accuracy; Cameras; Computational modeling; Dictionaries; Encoding; Three-dimensional displays; Visualization
fLanguage
English
Publisher
ieee
Conference_Title
2015 IEEE 28th Canadian Conference on Electrical and Computer Engineering (CCECE)
Conference_Location
Halifax, NS
ISSN
0840-7789
Print_ISBN
978-1-4799-5827-6
Type
conf
DOI
10.1109/CCECE.2015.7129346
Filename
7129346