DocumentCode
2329350
Title
Learning from images and speech with Non-negative Matrix Factorization enhanced by input space scaling
Author
Driesen, Joris ; Van hamme, Hugo ; Kleijn, W. Bastiaan
Author_Institution
Dept. ESAT-PSI, K.U. Leuven, Leuven, Belgium
fYear
2010
fDate
12-15 Dec. 2010
Firstpage
1
Lastpage
6
Abstract
Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). To this end, the different modalities of the input are converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation is that only a subset of these data features actually aids the learning. In this paper, we first describe a simple NMF-based recognition framework operating on speech and image data. We then propose and demonstrate a novel algorithm that scales the inputs of this framework in order to optimize its recognition performance.
Keywords
image recognition; learning (artificial intelligence); matrix decomposition; speech recognition; NMF-based recognition framework; computational learning; data representation; input space scaling; latent Dirichlet allocation; multimodal data; nonnegative matrix factorization; probabilistic latent semantic analysis; Feature Selection; Image Recognition; Machine Learning; Multi-modal Learning; Vocabulary Acquisition
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-7904-7
Electronic_ISBN
978-1-4244-7902-3
Type
conf
DOI
10.1109/SLT.2010.5700813
Filename
5700813