• DocumentCode
    50920
  • Title

    Online Non-Negative Convolutive Pattern Learning for Speech Signals

  • Author

    Wang, Dong ; Vipperla, Ravichander ; Evans, Nicholas ; Zheng, Thomas Fang

  • Author_Institution
    Tsinghua Univ., Beijing, China
  • Volume
    61
  • Issue
    1
  • fYear
    2013
  • fDate
    Jan. 1, 2013
  • Firstpage
    44
  • Lastpage
    56
  • Abstract
    The unsupervised learning of spectro-temporal patterns within speech signals is of interest in a broad range of applications. Where patterns are non-negative and convolutive in nature, relevant learning algorithms include convolutive non-negative matrix factorization (CNMF) and its sparse alternative, convolutive non-negative sparse coding (CNSC). Both algorithms, however, place unrealistic demands on computing power and memory which prohibit their application in large-scale tasks. This paper proposes a new online implementation of CNMF and CNSC which processes input data piece-by-piece and updates learned patterns gradually with accumulated statistics. The proposed approach facilitates pattern learning with huge volumes of training data that are beyond the capability of existing alternatives. We show that, with unlimited data and computing resources, the new online learning algorithm almost surely converges to a local minimum of the objective cost function. In more realistic situations, where the amount of data is large and computing power is limited, online learning tends to obtain lower empirical cost than conventional batch learning.
  • Keywords
    learning (artificial intelligence); matrix decomposition; speech coding; CNMF online implementation; CNSC online implementation; batch learning; convolutive nonnegative matrix factorization; convolutive nonnegative sparse coding; input data piece-by-piece; objective cost function; online learning algorithm; online nonnegative convolutive pattern learning algorithm; pattern learning; spectro-temporal patterns; speech signals; training data; unlimited data; unsupervised learning; Cost function; Databases; Image reconstruction; Speech; Speech processing; Training; Training data; Non-negative matrix factorization; convolutive NMF; online pattern learning; sparse coding; speech processing; speech recognition;
  • fLanguage
    English
  • Journal_Title
    Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1053-587X
  • Type

    jour

  • DOI
    10.1109/TSP.2012.2222381
  • Filename
    6320674