DocumentCode :
705090
Title :
A comparison of auditory features for robust speech recognition
Author :
Kelly, Finnian ; Harte, Naomi
Author_Institution :
Dept. of Electron. & Electr. Eng., Sigmedia Group, Trinity Coll. Dublin, Dublin, Ireland
fYear :
2010
fDate :
23-27 Aug. 2010
Firstpage :
1968
Lastpage :
1972
Abstract :
This paper presents a detailed comparison of the performance of two auditory-based feature extraction algorithms for automatic speech recognition (ASR). The feature sets are Zero-Crossings with Peak Amplitudes (ZCPA) and the recently introduced Power-Law Nonlinearity and Power-Bias Subtraction (PNCC). Standard Mel-Frequency Cepstral Coefficients (MFCC) are also tested for comparison. Although front-ends have been compared in previous papers, this work focuses on two of the most promising algorithms for noise robustness. The performance of all features is reported on the TIMIT database using an HMM system. It is found that the PNCC features outperform MFCC in clean conditions and are robust to noise. ZCPA performance is shown to vary widely with filterbank configuration and frame length. The ZCPA performance is poor in clean conditions but is the least affected by white noise. PNCC is shown to be the most promising new feature set for robust ASR in recent years.
Keywords :
channel bank filters; feature extraction; hidden Markov models; speech recognition; ASR; HMM system; MFCC; Mel-frequency cepstral coefficients; PNCC; TIMIT database; auditory based feature extraction; automatic speech recognition; filterbank configuration; peak amplitudes; power bias subtraction; power law nonlinearity; robust speech recognition; zero crossings; Accuracy; Feature extraction; Finite impulse response filters; Histograms; Mel frequency cepstral coefficient; Speech; Speech recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Signal Processing Conference, 2010 18th European
Conference_Location :
Aalborg
ISSN :
2219-5491
Type :
conf
Filename :
7096363