• DocumentCode
    2329350
  • Title
    Learning from images and speech with Non-negative Matrix Factorization enhanced by input space scaling
  • Author
    Driesen, Joris ; Van hamme, Hugo ; Kleijn, W. Bastiaan
  • Author_Institution
    Dept. ESAT-PSI, K.U. Leuven, Leuven, Belgium
  • fYear
    2010
  • fDate
    12-15 Dec. 2010
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Computational learning from multimodal data is often done with matrix factorization techniques such as NMF (Non-negative Matrix Factorization), pLSA (Probabilistic Latent Semantic Analysis) or LDA (Latent Dirichlet Allocation). To this end, the different input modalities are converted into features that are easily placed in a vectorized format. An inherent weakness of such a data representation is that only a subset of these features actually aids the learning. In this paper, we first describe a simple NMF-based recognition framework operating on speech and image data. We then propose and demonstrate a novel algorithm that scales the inputs of this framework in order to optimize its recognition performance.
  • Keywords
    image recognition; learning (artificial intelligence); matrix decomposition; speech recognition; NMF-based recognition framework; computational learning; data representation; input space scaling; latent Dirichlet allocation; multimodal data; nonnegative matrix factorization; probabilistic latent semantic analysis; Feature Selection; Image Recognition; Machine Learning; Multi-modal Learning; Vocabulary Acquisition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language Technology Workshop (SLT), 2010 IEEE
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-7904-7
  • Electronic_ISBN
    978-1-4244-7902-3
  • Type
    conf
  • DOI
    10.1109/SLT.2010.5700813
  • Filename
    5700813
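
The abstract mentions factorizing vectorized multimodal features with NMF after scaling the inputs. The following is only a rough sketch of that general idea, not the authors' algorithm: it uses the standard Lee-Seung multiplicative updates for the Frobenius objective, and the per-feature weights are hand-picked for illustration, whereas the paper proposes an algorithm that chooses the scaling to optimize recognition performance. The names `nmf` and `scale_inputs`, the toy data, and the weight values are all assumptions introduced here.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V (features x samples) as W @ H.

    Uses Lee-Seung multiplicative updates for the squared Frobenius error.
    This is a generic NMF routine, not the specific framework of the paper.
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def scale_inputs(V, weights):
    """Multiply each feature (row of V) by a non-negative weight."""
    return V * np.asarray(weights, dtype=float)[:, None]


# Toy data (assumption): 6 features, 20 samples, all non-negative.
rng = np.random.default_rng(1)
V = rng.random((6, 20))

# Hypothetical weights: down-weight the last three features, as if they
# were judged less useful for learning. The paper learns such a scaling;
# here the values are fixed by hand.
weights = np.array([1.0, 1.0, 1.0, 0.1, 0.1, 0.1])

V_scaled = scale_inputs(V, weights)
W, H = nmf(V_scaled, rank=2)
error = np.linalg.norm(V_scaled - W @ H)
```

Down-weighted features contribute less to the reconstruction error, so the factorization is driven mainly by the features that keep a weight near 1.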