Speech analysis and recognition using interval statistics generated from a composite auditory model

Author

Sheikhzadeh, H. ; Deng, L.

Author_Institution

Dept. of Electr. & Comput. Eng., Waterloo Univ., Ont., Canada

Volume

6

Issue

1

fYear

1998

fDate

1/1/1998 12:00:00 AM

Firstpage

90

Lastpage

94

Abstract

A modeling approach to auditory speech analysis and recognition is proposed and evaluated. A composite auditory model generates parallel sets of auditory-nerve instantaneous firing rates (IFRs) along the spatial dimension, and a subsequent processing stage constructs interval statistics from the IFRs in a form called the interpeak interval histogram (IPIH). A speech preprocessor is designed that transforms the auditory IPIHs and interfaces the IPIH-based auditory representation with a hidden Markov model-based (HMM-based) speech recognizer. The results demonstrate that the new preprocessor consistently outperforms the conventional mel frequency cepstral coefficient-based (MFCC-based) preprocessor at signal-to-noise ratio (SNR) levels up to at least 16 dB.
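
The interval-statistics step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how peaks are picked, how intervals are binned, or how channels are combined, so the local-maximum rule, the bin edges, and the pooling across channels below are all assumptions. The function names `interpeak_intervals` and `ipih` are hypothetical, and the composite auditory model is replaced by firing-rate arrays supplied by the caller.

```python
import numpy as np


def interpeak_intervals(rate, fs):
    """Intervals (in seconds) between successive local maxima of one channel.

    Assumption: a peak is a sample strictly greater than its left neighbour
    and at least as large as its right neighbour. The paper may use a
    different peak-picking rule.
    """
    rate = np.asarray(rate, dtype=float)
    mid = rate[1:-1]
    is_peak = (mid > rate[:-2]) & (mid >= rate[2:])
    peak_idx = np.flatnonzero(is_peak) + 1  # shift back to indices of `rate`
    return np.diff(peak_idx) / fs


def ipih(rates, fs, bin_edges):
    """Pool interpeak intervals from all channels into a single histogram.

    Assumption: channels are simply summed; `rates` is an iterable of
    1-D firing-rate arrays, one per channel along the spatial dimension.
    """
    counts = np.zeros(len(bin_edges) - 1, dtype=int)
    for rate in rates:
        hist, _ = np.histogram(interpeak_intervals(rate, fs), bins=bin_edges)
        counts += hist
    return counts


if __name__ == "__main__":
    # Two toy "channels" sampled at 1 Hz: peaks every 3 and every 2 samples.
    channels = [
        [0, 1, 0, 0, 1, 0, 0, 1, 0],
        [0, 2, 0, 2, 0, 2, 0, 2, 0],
    ]
    edges = [0.5, 1.5, 2.5, 3.5]
    print(ipih(channels, fs=1.0, bin_edges=edges).tolist())  # [0, 3, 2]
```

In the toy input, the first channel contributes two 3-sample intervals and the second contributes three 2-sample intervals, so the histogram concentrates in the bins matching each channel's dominant period. A real front end would apply the same counting to the model's IFR outputs frame by frame before any transformation for the recognizer.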

Keywords

acoustic signal processing; hearing; hidden Markov models; speech processing; speech recognition; statistical analysis; HMM-based speech recognizer; SNR; auditory representation; auditory speech analysis; auditory-nerve instantaneous firing rates; composite auditory model; hidden Markov model; interpeak interval histogram; interval statistics; mel frequency cepstral coefficient; processing stage; signal-to-noise ratio; spatial dimension; speech preprocessor; Acoustic beams; Acoustic signal processing; Computational complexity; Hidden Markov models; Natural languages; Neural networks; Pattern recognition; Speech analysis; Speech processing; Speech recognition

fLanguage

English

Journal_Title

Speech and Audio Processing, IEEE Transactions on

Publisher

IEEE

ISSN

1063-6676

Type

jour

DOI

10.1109/89.650316

Filename

650316