Title :
A comparison of auditory features for robust speech recognition
Author :
Kelly, Finnian ; Harte, Naomi
Author_Institution :
Dept. of Electron. & Electr. Eng., Sigmedia Group, Trinity Coll. Dublin, Dublin, Ireland
Abstract :
This paper presents a detailed comparison of the performance of two auditory-based feature extraction algorithms for automatic speech recognition (ASR). The feature sets are Zero-Crossings with Peak Amplitudes (ZCPA) and the recently introduced Power-Law Nonlinearity and Power-Bias Subtraction (PNCC). Standard Mel-Frequency Cepstral Coefficients (MFCC) are also tested for comparison. Although front-ends have been compared in previous papers, this work focuses on two of the most promising algorithms for noise robustness. The performance of all features is reported on the TIMIT database using an HMM system. It is found that the PNCC features outperform MFCC in clean conditions and are robust to noise. ZCPA performance is shown to vary widely with filterbank configuration and frame length. ZCPA performs poorly in clean conditions but is the least affected by white noise. PNCC is shown to be the most promising new feature set for robust ASR in recent years.
Keywords :
channel bank filters; feature extraction; hidden Markov models; speech recognition; ASR; HMM system; MFCC; Mel-frequency cepstral coefficients; PNCC; TIMIT database; auditory based feature extraction; automatic speech recognition; filterbank configuration; peak amplitudes; power bias subtraction; power law nonlinearity; robust speech recognition; zero crossings; Accuracy; Feature extraction; Finite impulse response filters; Histograms; Mel frequency cepstral coefficient; Speech; Speech recognition
Conference_Title :
Signal Processing Conference, 2010 18th European
Conference_Location :
Aalborg