DocumentCode
1548053
Title
A model of dynamic auditory perception and its application to robust word recognition
Author
Strope, Brian ; Alwan, Abeer
Author_Institution
Dept. of Electr. Eng., California Univ., Los Angeles, CA, USA
Volume
5
Issue
5
fYear
1997
fDate
9/1/1997
Firstpage
451
Lastpage
464
Abstract
This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward masking experiments, together with previously reported forward masking data, determines the model's dynamic parameters. Once parameterized, the simple exponential dynamic mechanism predicts the nature of forward masking data from several studies across wide-ranging frequencies, input levels, and probe delay times. An initial evaluation of the dynamic model together with a local peak isolation mechanism as a front end for dynamic time warp (DTW) and hidden Markov model (HMM) word recognition systems shows an improvement in robustness to background noise when compared to Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra (RASTA) based front ends.
Keywords
band-pass filters; filtering theory; hearing; hidden Markov models; noise; parameter estimation; prediction theory; spectral analysis; speech processing; speech recognition; HMM word recognition systems; Mel-frequency cepstral coefficients; additive logarithmic adaptation; automatic speech recognition front end; background noise; dynamic auditory perception; dynamic parameters; exponential dynamic mechanism; filter output; forward masking data; hidden Markov model; input levels; linear filterbank; linear prediction cepstral coefficients; local spectral peaks; local spectral peaks isolation; perceptual forward masking experiments; probe delay times; relative spectra; robust word recognition; Automatic speech recognition; Cepstral analysis; Delay; Filter bank; Frequency; Hidden Markov models; Noise robustness; Nonlinear filters; Predictive models; Probes
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/89.622569
Filename
622569