DocumentCode :
2518412
Title :
Detecting salient fragments for video human action detection and recognition using an associative memory
Author :
Da-Wei Kuo ; Guan-Yu Cheng ; Shyi-Chyi Cheng ; Su-Ling Lee
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Taiwan Ocean Univ., Keelung, Taiwan
fYear :
2012
fDate :
2-5 Oct. 2012
Firstpage :
1039
Lastpage :
1044
Abstract :
This paper presents a novel approach that locates action objects in video and recognizes their action types simultaneously using an associative memory model. A preprocessing procedure extracts key-frames from a video sequence to provide a compact representation of the video. Each training key-frame is partitioned into multiple overlapping patches, from which image and motion features are extracted to generate an appearance-motion codebook. The training procedure also constructs a two-directional associative memory based on the learnt codebook, which enables the system to detect and recognize video action events using salient fragments, i.e., patch groups with common motion vectors. Our approach adopts the recently developed Hough voting model as a framework for human action learning and memory. For each key-frame, the Hough voting framework employs the Generalized Hough Transform (GHT), which constructs a graphical structure based on key-frame codewords to learn the mapping between action objects and a Hough space. To determine which patches explicitly represent an action object, the system detects salient fragments whose member patches are used to query the associative memory and retrieve matched patches from the Hough model. These model patches are then used to locate the target action object and classify the action type simultaneously through a probabilistic Hough voting scheme. Experimental results show that the proposed method achieves good performance on several publicly available datasets in terms of detection accuracy and recognition rate.
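The abstract's final step, probabilistic Hough voting, can be illustrated with a minimal sketch: each matched codeword casts offset votes for the object centre, weighted by per-class probabilities, and the accumulator peak yields both the location and the action label. The class names (Codeword, Patch), the equal splitting of vote mass across offsets, and the data layout below are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal probabilistic Hough voting sketch (illustrative only; structures are
# hypothetical and not taken from the paper).
import numpy as np
from dataclasses import dataclass

@dataclass
class Codeword:
    offsets: np.ndarray        # (K, 2) learnt displacements from patch to object centre
    class_weights: np.ndarray  # (C,) per-action-class vote weights for this codeword

@dataclass
class Patch:
    position: np.ndarray       # (2,) patch centre (x, y) in the key-frame
    codeword_id: int           # index of the best-matching codeword

def hough_vote(patches, codebook, frame_shape, n_classes):
    """Accumulate votes for object centre and action class; return the peak."""
    h, w = frame_shape
    accumulator = np.zeros((n_classes, h, w))
    for patch in patches:
        cw = codebook[patch.codeword_id]
        k = len(cw.offsets)
        for offset in cw.offsets:
            cx, cy = np.round(patch.position + offset).astype(int)
            if 0 <= cx < w and 0 <= cy < h:
                # each stored offset shares the codeword's vote mass equally
                accumulator[:, cy, cx] += cw.class_weights / k
    # accumulator peak gives the action class and the object centre
    cls, y, x = np.unravel_index(accumulator.argmax(), accumulator.shape)
    return cls, (x, y), accumulator
```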
Keywords :
Hough transforms; content-addressable storage; feature extraction; gesture recognition; image matching; image motion analysis; image representation; image sequences; learning (artificial intelligence); object detection; probability; video coding; GHT; Hough model; Hough space; Hough voting model; action object location; appearance-motion codebook; associative memory model; detection accuracy; generalized Hough transform; graphical structure; human action learning; image feature extraction; key-frame codeword; key-frame extraction; motion feature extraction; motion vector; patch group; patch matching; probabilistic Hough voting scheme; recognition rate; salient fragment detection; two-directional associative memory; video action event detection; video action event recognition; video human action detection; video human action recognition; video representation; video sequence; Associative memory; Deformable models; Feature extraction; Shape; Training; Vectors; Video sequences; Action shapes; Generalized Hough Transform; associative memory; human action detection; recognition; salient fragment;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Communications and Information Technologies (ISCIT), 2012 International Symposium on
Conference_Location :
Gold Coast, QLD
Print_ISBN :
978-1-4673-1156-4
Electronic_ISBN :
978-1-4673-1155-7
Type :
conf
DOI :
10.1109/ISCIT.2012.6380844
Filename :
6380844