Multi-view descriptor mining via codeword net for action recognition

Author

Jingyu Liu;Yongzhen Huang;Xiaojiang Peng;Liang Wang

Author_Institution

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China

fYear

2015

Firstpage

793

Lastpage

797

Abstract

Action recognition is an important yet challenging task in computer vision. A successful and widely used framework in this field is the Bag of Visual Words (BoVW), whose first step is to extract local features. One critical property of local features is that they are often multi-view; e.g., the dense trajectory feature includes both appearance and motion properties. Different types of features are aligned together in coding and pooling, which leaves the process heavily entangled. Our motivation is to disentangle the sub-descriptors so that each contributes to the maximum extent. To achieve this, a codeword net is constructed by exploiting the relations between features and codewords. Based on the codeword net, features from the same viewpoint are pooled together. Experiments on two large-scale action recognition datasets, UCF50 and HMDB51, demonstrate that our approach can improve on state-of-the-art algorithms.
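
To make the entanglement issue concrete, the following is a minimal sketch of plain BoVW coding and pooling applied separately to each view of a multi-view descriptor. It is not the codeword net described in the abstract, whose construction is not given here; it only illustrates what coding and pooling each sub-descriptor on its own might look like. The hard-assignment coding, the sum pooling with L1 normalization, and all names (`nearest_codeword`, `bovw_histogram`, `per_view_histogram`, `view_slices`, `codebooks`) are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def nearest_codeword(x: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(x, codebook[k]))


def bovw_histogram(features: Sequence[Vector], codebook: Sequence[Vector]) -> List[float]:
    """Hard-assignment coding followed by sum pooling and L1 normalization."""
    hist = [0.0] * len(codebook)
    for x in features:
        hist[nearest_codeword(x, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def per_view_histogram(
    features: Sequence[Vector],
    view_slices: Dict[str, Tuple[int, int]],
    codebooks: Dict[str, Sequence[Vector]],
) -> List[float]:
    """Code and pool each view (sub-descriptor) separately, then concatenate.

    view_slices maps a hypothetical view name to the (start, stop) range of
    dimensions that the view occupies inside each full descriptor.
    """
    video_repr: List[float] = []
    for name, (start, stop) in view_slices.items():
        sub_features = [x[start:stop] for x in features]
        video_repr.extend(bovw_histogram(sub_features, codebooks[name]))
    return video_repr


# Toy data: 4-D descriptors whose first two dimensions stand in for an
# appearance view and whose last two stand in for a motion view.
toy_features = [
    [0.0, 0.1, 1.0, 0.9],
    [0.9, 1.0, 0.1, 0.0],
    [0.1, 0.0, 0.0, 0.1],
]
toy_slices = {"appearance": (0, 2), "motion": (2, 4)}
toy_codebooks = {
    "appearance": [[0.0, 0.0], [1.0, 1.0]],
    "motion": [[0.0, 0.0], [1.0, 1.0]],
}
toy_repr = per_view_histogram(toy_features, toy_slices, toy_codebooks)
```

In this sketch each view gets its own codebook and its own histogram, so a descriptor's appearance part never influences which motion codeword it is assigned to. A real pipeline would learn the codebooks (for example with k-means) rather than fix them by hand.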

Keywords

"Encoding","Computer vision","Feature extraction","Pattern recognition","Joining processes","Visualization","Trajectory"

Publisher

IEEE

Conference_Titel

Image Processing (ICIP), 2015 IEEE International Conference on

Type

conf

DOI

10.1109/ICIP.2015.7350908

Filename

7350908