Title :
Feature Extraction Based on Pitch-Synchronous Averaging for Robust Speech Recognition
Author :
Morales-Cordovilla, Juan A. ; Peinado, Antonio M. ; Sánchez, Victoria ; González, José A.
Author_Institution :
Dept. de Teoría de la Señal, Telemática y Comunicaciones, Univ. de Granada, Granada, Spain
fDate :
3/1/2011 12:00:00 AM
Abstract :
In this paper, we propose two estimators for the autocorrelation sequence of a periodic signal in additive noise. Both estimators are formulated using tables that contain all possible products of sample pairs in a speech signal frame. The first estimator is based on pitch-synchronous averaging. We analyze this estimator statistically and show that the signal-to-noise ratio (SNR) can be increased by up to a factor equal to the number of available periods. The second estimator is similar to the first, but it avoids the sample products that are more likely to be affected by noise. We prove that, under certain conditions, this estimator can remove the effect of additive noise in a statistical sense. Both estimators are used to extract mel frequency cepstral coefficients (MFCCs) as features for robust speech recognition. Although the estimators were initially conceived for voiced speech frames, we extend their application to unvoiced sounds in order to obtain a coherent feature extractor. The experimental results show the superiority of the proposed approach over other MFCC-based front-ends such as higher-lag autocorrelation spectrum estimation (HASE), which also relies on avoiding the autocorrelation coefficients that are more likely to be affected by noise.
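The abstract gives no formulas, so the following is only a minimal sketch of one way to read the two estimators, not the authors' formulation. It assumes the pitch period is a known integer number of samples, that the frame spans a whole number of periods, and that the products "more likely affected by noise" are the squared samples on the table diagonal (which holds for white noise uncorrelated with the signal). The names `product_table` and `pitch_sync_autocorr` and the `skip_diagonal` flag are hypothetical.

```python
import numpy as np


def product_table(frame):
    """Table of all sample-pair products: table[n, m] = frame[n] * frame[m]."""
    return np.outer(frame, frame)


def pitch_sync_autocorr(frame, period, skip_diagonal=False):
    """Estimate one period of the autocorrelation of a periodic signal.

    For lag k, average every product frame[n] * frame[m] whose index
    difference satisfies (m - n) mod period == k. For a periodic signal all
    of these products share the same expected clean value.

    skip_diagonal=True leaves out the products with n == m (an assumption
    about which products are noise-dominated; see the lead-in).
    """
    frame = np.asarray(frame, dtype=float)
    if len(frame) < 2 * period:
        raise ValueError("frame must contain at least two pitch periods")

    table = product_table(frame)
    n = np.arange(len(frame))
    lag_class = (n[None, :] - n[:, None]) % period  # (m - n) mod period
    keep = np.ones(table.shape, dtype=bool)
    if skip_diagonal:
        keep &= n[None, :] != n[:, None]

    r = np.empty(period)
    for k in range(period):
        r[k] = table[keep & (lag_class == k)].mean()
    return r


if __name__ == "__main__":
    T = 16
    one_period = np.sin(2 * np.pi * np.arange(T) / T)
    clean = np.tile(one_period, 8)  # 8 pitch periods in the frame
    noisy = clean + 0.3 * np.random.default_rng(0).standard_normal(clean.size)
    print("lag-0, averaged:", pitch_sync_autocorr(noisy, T)[0])
    print("lag-0, diagonal skipped:", pitch_sync_autocorr(noisy, T, True)[0])
```

Under these assumptions, white noise adds its variance only to the diagonal products. In the first estimator that contribution is diluted across all products in the lag-0 class, which is consistent with the SNR gain bounded by the number of periods; in the second it is left out entirely.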
Keywords :
cepstral analysis; correlation methods; feature extraction; speech recognition; autocorrelation coefficient; autocorrelation sequence; higher lag autocorrelation spectrum estimation; mel frequency cepstral coefficient; periodic signal; pitch synchronous averaging; robust speech recognition; signal-to-noise ratio; speech signal frame; voiced speech frames; Acoustic noise; Additive noise; Autocorrelation; Frequency estimation; Noise robustness; Spectral analysis; autocorrelation estimation; autocorrelation-based mel frequency cepstral coefficient (AMFCC); pitch-synchronous analysis
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
DOI :
10.1109/TASL.2010.2053846