DocumentCode
639012
Title
Multi-modal GM-pLSA and its application to video classification
Author
Cencen Zhong ; Zhenjiang Miao
Author_Institution
Inst. of Inf. Sci., Beijing Jiaotong Univ., Beijing, China
fYear
2013
fDate
15-19 July 2013
Firstpage
1
Lastpage
4
Abstract
To extend standard probabilistic Latent Semantic Analysis (pLSA) to continuous quantities, pLSA with Gaussian Mixtures (GM-pLSA) has been proposed, which models the continuous features of terms with a Gaussian Mixture Model (GMM). Building on GM-pLSA, this paper presents a multi-modal GM-pLSA (MMGM-pLSA) model for the case where continuous features from multiple modalities are extracted from one term. Under the assumption that the multi-modal features of one term are generated independently from the same latent aspect, multiple GMMs are introduced, each describing the feature distribution of one modality. In this way, the characteristics of each modality are captured. To evaluate performance, a prototype video classification task is devised in which each video clip is treated as one document and its sub-shots as terms. Experimental comparisons with other approaches demonstrate the effectiveness of MMGM-pLSA.
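The independence assumption stated in the abstract can be illustrated with a minimal sketch: given a latent aspect, the joint log-likelihood of one term's multi-modal features is the sum of the log-densities under that aspect's per-modality GMMs. This is not the authors' implementation. The function names `gmm_log_density` and `term_log_likelihood`, the diagonal-covariance form, and the parameter layout are all assumptions made for illustration.

```python
import numpy as np

def gmm_log_density(x, weights, means, variances):
    """Log density of a diagonal-covariance GMM at one point x.

    Assumed shapes: x (D,), weights (K,), means (K, D), variances (K, D).
    The diagonal covariance is an illustrative simplification.
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    # Per-component log N(x | mean_k, diag(var_k)).
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    log_exp = -0.5 * np.sum((x - means) ** 2 / variances, axis=1)
    log_terms = np.log(weights) + log_norm + log_exp
    # Log-sum-exp over components for numerical stability.
    peak = np.max(log_terms)
    return peak + np.log(np.sum(np.exp(log_terms - peak)))

def term_log_likelihood(features, aspect_gmms):
    """log p(x^1, ..., x^M | z) = sum over modalities m of log GMM_m(x^m | z).

    features: list of M feature vectors, one per modality.
    aspect_gmms: list of M (weights, means, variances) tuples for one aspect z.
    """
    return sum(gmm_log_density(x, *gmm) for x, gmm in zip(features, aspect_gmms))

# Hypothetical example: one term with a 2-D feature and a 1-D feature,
# each modelled by a single standard-normal component for the same aspect.
gmm_a = ([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
gmm_b = ([1.0], [[0.0]], [[1.0]])
joint = term_log_likelihood([[0.0, 0.0], [0.0]], [gmm_a, gmm_b])
```

Summing log-densities corresponds to multiplying the per-modality densities, which is what conditional independence given the aspect implies. How the mixture parameters are estimated is not covered by this sketch.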
Keywords
Gaussian processes; feature extraction; image classification; probability; statistical analysis; video signal processing; Gaussian mixtures; continuous feature extraction; feature distribution depiction; multimodal GM-pLSA; performance evaluation; probabilistic latent semantic analysis; video classification; video clip; Accuracy; Hidden Markov models; Mel frequency cepstral coefficient; Semantics; Standards; Visualization; multiple modalities; pLSA with Gaussian Mixtures (GM-pLSA)
fLanguage
English
Publisher
IEEE
Conference_Title
Multimedia and Expo Workshops (ICMEW), 2013 IEEE International Conference on
Conference_Location
San Jose, CA
Type
conf
DOI
10.1109/ICMEW.2013.6618306
Filename
6618306