• DocumentCode
    1217751
  • Title

    Feature Enhancement for Noisy Speech Recognition With a Time-Variant Linear Predictive HMM Structure

  • Author

    Deng, Jianping ; Bouchard, Martin ; Yeap, Tet Hin

  • Author_Institution
    Sch. of Inf. Technol. & Eng., Univ. of Ottawa, Ottawa, ON
  • Volume
    16
  • Issue
    5
  • fYear
    2008
  • fDate
    7/1/2008
  • Firstpage
    891
  • Lastpage
    899
  • Abstract
    This paper presents a new approach to speech feature enhancement in the log-spectral domain for noisy speech recognition. A switching linear dynamic model (SLDM) is explored as a parametric model for the clean speech distribution. Each multivariate linear dynamic model (LDM) is associated with a hidden state of a hidden Markov model (HMM) in an attempt to describe the temporal correlations among adjacent frames of speech features. A state transition on the Markov chain either activates a different LDM or activates several LDMs simultaneously, weighted by probabilities generated by the HMM. Instead of a single set of transition probabilities held fixed for the whole process, a connectionist model is employed to learn time-variant transition probabilities. With the resulting SLDM as the speech model and a separate model for the noise, speech and noise are jointly tracked by means of switching Kalman filtering. Comprehensive experiments are carried out on the Aurora2 database to evaluate the new algorithm. The results show that the new SLDM approach further improves speech feature enhancement performance in terms of noise-robust recognition accuracy, because the transition probabilities among the LDMs can be described more precisely at each time point.
  • Keywords
    Kalman filters; hidden Markov models; speech enhancement; speech recognition; Aurora2 database; Markov chain; clean speech distribution; hidden Markov model; log-spectral domain; multivariate linear dynamic model; noisy speech recognition; speech feature enhancement; switching Kalman filtering; switching linear dynamic model; temporal correlations; time variant transition probabilities; transition probabilities; switching linear dynamic models (SLDMs); time-variant linear predictive hidden Markov model (HMM)
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type

    jour

  • DOI
    10.1109/TASL.2004.924593
  • Filename
    4519798
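
The switching scheme summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a scalar state instead of a multivariate log-spectral feature vector, a linear observation model y_t = x_t + noise instead of a joint speech/noise model, a single softmax layer as a hypothetical stand-in for the connectionist model, and GPB1-style moment matching to collapse the mixture after each step. All function and parameter names (`transition_probs`, `gpb1_step`, `weights`, `biases`, `obs_var`) are invented for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def transition_probs(x_prev, weights, biases):
    """Hypothetical stand-in for the connectionist model: maps the previous
    state estimate to time-variant activation probabilities, one per LDM."""
    return softmax([w * x_prev + b for w, b in zip(weights, biases)])


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gpb1_step(mean, var, y, ldms, obs_var, weights, biases):
    """One switching-Kalman step for a scalar state (assumed simplification).

    ldms: list of (a, b, q), each LDM being x_t = a * x_{t-1} + b + N(0, q).
    Assumed observation model: y_t = x_t + N(0, obs_var).
    """
    prior = transition_probs(mean, weights, biases)
    post_w, post_m, post_v = [], [], []
    for p, (a, b, q) in zip(prior, ldms):
        pred_mean = a * mean + b          # LDM prediction
        pred_var = a * a * var + q
        innov_var = pred_var + obs_var    # innovation variance
        gain = pred_var / innov_var       # Kalman gain
        post_m.append(pred_mean + gain * (y - pred_mean))
        post_v.append((1.0 - gain) * pred_var)
        # Weight each LDM by its prior probability times the data likelihood.
        post_w.append(p * gaussian_pdf(y, pred_mean, innov_var))
    total = sum(post_w)
    post_w = [w / total for w in post_w]
    # Collapse the Gaussian mixture to one Gaussian by moment matching.
    m = sum(w * mu for w, mu in zip(post_w, post_m))
    v = sum(w * (pv + (mu - m) ** 2) for w, mu, pv in zip(post_w, post_m, post_v))
    return m, v, post_w


# Usage: two illustrative LDMs tracking a short, made-up observation sequence.
ldms = [(0.9, 0.0, 0.1), (0.5, 1.0, 0.2)]
mean, var = 0.0, 1.0
for y in [0.2, 0.5, 1.4, 1.9]:
    mean, var, ldm_weights = gpb1_step(mean, var, y, ldms, 0.3, [1.0, -1.0], [0.0, 0.0])
```

Because the activation probabilities are recomputed from the current estimate at every step, the LDM weighting changes over time, which is the idea behind the time-variant transition probabilities; a fixed transition matrix would replace `transition_probs` with a constant lookup.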