• DocumentCode
    918762
  • Title

    Filterbank-energy estimation using mixture and Markov models for recognition of noisy speech

  • Author

    Erell, Adoram ; Weintraub, Mitchel

  • Author_Institution
    SRI Inst., Menlo Park, CA, USA
  • Volume
    1
  • Issue
    1
  • fYear
    1993
  • fDate
    1/1/1993 12:00:00 AM
  • Firstpage
    68
  • Lastpage
    76
  • Abstract
    An estimation algorithm for noise-robust speech recognition, the minimum mean log spectral distance (MMLSD), is presented. The estimation is matched to the recognizer by seeking to minimize the average distortion as measured by a Euclidean distance between filterbank log-energy vectors, approximating the weighted-cepstral distance used by the recognizer. The estimation is computed using a clean speech spectral probability distribution, estimated from a database, and a stationary ARMA model for the noise. When trained on clean speech and tested with additive white noise at 10-dB SNR, the recognition accuracy with the MMLSD algorithm is comparable to that achieved with training the recognizer at the same constant 10-dB SNR. The algorithm is also highly efficient with a quasi-stationary environmental noise, recorded with a desktop microphone, and requires almost no tuning to differences between this noise and the computer-generated white noise.
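    Under a squared Euclidean distortion on log-energies, the estimate that minimizes the average distortion is the conditional mean of the clean log-energy given the noisy observation. The following is a minimal single-channel sketch of that principle, not the paper's implementation. It assumes an empirical prior made of clean log-energy samples, a known constant noise energy (standing in for the ARMA noise model), and a Gaussian error on the observed log-energy. The function name `mmse_log_energy` and all parameter values are hypothetical.

```python
import math
import random


def mmse_log_energy(z_obs, clean_log_prior, noise_energy, sigma=0.2):
    """Posterior-mean estimate of the clean log-energy in one channel.

    Illustrative assumptions (not taken from the paper):
    - the prior is a list of clean log-energy samples,
    - noise adds in the energy domain with a known constant energy,
    - the observed log-energy equals log(S + N) plus Gaussian error (sigma).
    """
    # Log-likelihood of the observation under each prior sample.
    log_w = [
        -0.5 * ((z_obs - math.log(math.exp(x) + noise_energy)) / sigma) ** 2
        for x in clean_log_prior
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_w)
    w = [math.exp(lw - peak) for lw in log_w]
    total = sum(w)
    # Weighted average of the prior samples = conditional mean.
    return sum(wi * xi for wi, xi in zip(w, clean_log_prior)) / total


random.seed(0)
prior = [random.gauss(2.0, 1.5) for _ in range(5000)]  # toy "database"
noise_energy = 1.0
true_x = 0.5
z = math.log(math.exp(true_x) + noise_energy)  # noisy log-energy
estimate = mmse_log_energy(z, prior, noise_energy)
print(f"observed {z:.3f}, estimated clean {estimate:.3f}")
```

    In this toy setting the estimate falls below the observed log-energy, because the additive noise inflates the observation and the posterior mean corrects for it.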
  • Keywords
    Markov processes; filtering and prediction theory; speech recognition; white noise; 10 dB; ARMA model; Euclidean distance; MMLSD algorithm; Markov models; SNR; additive white noise; average distortion; clean speech; computer-generated white noise; database; desktop microphone; estimation algorithm; filterbank energy estimation; filterbank log-energy vectors; minimum mean log spectral distance algorithm; noise robust speech recognition; noisy speech; quasistationary environmental noise; recognition accuracy; speech spectral probability distribution; weighted-cepstral distance; Distortion measurement; Distributed computing; Euclidean distance; Filter bank; Noise robustness; Probability distribution; Signal to noise ratio; Speech enhancement; Speech recognition; Working environment noise;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/89.221385
  • Filename
    221385