Title :
Improving codebook generation for action recognition using a mixture of Asymmetric Gaussians
Author :
Elguebaly, Tarek ; Bouguila, Nizar
Author_Institution :
Dept. of Electr. & Comput. Eng., Concordia Univ., Montreal, QC, Canada
Abstract :
Human activity recognition is a crucial area of computer vision research and applications. Its goal is to automatically analyze and interpret ongoing events and their context from video data. Recently, the bag of visual words (BoVW) approach has been widely applied to human action recognition. Generally, a representative corpus of videos is used to build the visual words dictionary, or codebook, using simple k-means clustering. This visual dictionary is then used to quantize the extracted features by assigning each input descriptor the label of the closest cluster centroid, measured by Euclidean distance. Thus, each video can be represented as a frequency histogram over visual words. However, the BoVW approach has several limitations, such as its need for a predefined codebook size, its dependence on the chosen set of visual words, and its use of hard-assignment clustering for histogram creation. In this paper, we address these issues by using a mixture of Asymmetric Gaussians to build the codebook. Our method is able to identify the best dictionary size in an unsupervised manner, to represent the set of input feature vectors by an estimate of their density distribution, and to allow soft assignments. Furthermore, we validate the efficiency of the proposed algorithm for human action recognition.
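The contrast the abstract draws between hard and soft assignment can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, and the asymmetric Gaussian density used here (a per-dimension Gaussian with separate left and right standard deviations) is an assumed form, since the abstract does not state the density. The mixture parameters are taken as given; fitting them and selecting the codebook size are not shown.

```python
import numpy as np


def hard_histogram(descriptors, centroids):
    """Classic BoVW: each descriptor votes for its nearest centroid (Euclidean)."""
    # Pairwise squared distances, shape (n_descriptors, n_words).
    d2 = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    hist = np.bincount(labels, minlength=len(centroids)).astype(float)
    return hist / hist.sum()


def asymmetric_gaussian_logpdf(x, mu, sigma_l, sigma_r):
    """Log-density of independent per-dimension asymmetric Gaussians.

    Assumed form: sigma_l applies where x < mu, sigma_r elsewhere, with the
    normalizer sqrt(2/pi) / (sigma_l + sigma_r) in each dimension.
    """
    sigma = np.where(x < mu, sigma_l, sigma_r)
    log_norm = 0.5 * np.log(2.0 / np.pi) - np.log(sigma_l + sigma_r)
    return (log_norm - (x - mu) ** 2 / (2.0 * sigma ** 2)).sum(axis=-1)


def soft_histogram(descriptors, weights, mus, sigmas_l, sigmas_r):
    """Soft assignment: average the posterior responsibilities of each component."""
    # log(weight_j * p(x_i | component j)), shape (n_descriptors, n_components).
    log_p = np.stack(
        [
            np.log(w) + asymmetric_gaussian_logpdf(descriptors, m, sl, sr)
            for w, m, sl, sr in zip(weights, mus, sigmas_l, sigmas_r)
        ],
        axis=1,
    )
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
    resp = np.exp(log_p)
    resp /= resp.sum(axis=1, keepdims=True)
    return resp.mean(axis=0)


# Toy usage with made-up 2-D descriptors and a two-word codebook.
descriptors = np.array([[0.0, 0.0], [0.1, 0.0], [2.4, 2.6], [5.0, 5.0]])
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
print(hard_histogram(descriptors, centers))
print(soft_histogram(descriptors, [0.5, 0.5], centers,
                     np.ones((2, 2)), 2.0 * np.ones((2, 2))))
```

In the hard histogram, a descriptor lying between two centroids contributes a full vote to one word. In the soft histogram, it spreads its vote across components in proportion to their posterior probabilities.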
Keywords :
Gaussian processes; image motion analysis; image recognition; image representation; mixture models; video coding; BoVW approach; asymmetric Gaussian mixture; bag of visual words; codebook generation improvement; density distribution; feature vector representation; human action recognition; soft assignments; video representation; visual words dictionary; Detectors; Dictionaries; Feature extraction; Hidden Markov models; Histograms; Vectors; Visualization; Gaussian mixture; Unsupervised learning; expectation-maximization
Conference_Title :
Computational Intelligence for Multimedia, Signal and Vision Processing (CIMSIVP), 2014 IEEE Symposium on
Conference_Location :
Orlando, FL
DOI :
10.1109/CIMSIVP.2014.7013267