DocumentCode :
149033
Title :
Gaussian Power flow Orientation Coefficients for noise-robust speech recognition
Author :
Gerazov, Branislav ; Ivanovski, Zoran
Author_Institution :
Fac. of Electr. Eng. & Inf. Technol., Ss. Cyril & Methodius Univ. Skopje, Skopje, Macedonia
fYear :
2014
fDate :
1-5 Sept. 2014
Firstpage :
1467
Lastpage :
1471
Abstract :
Spectro-temporal features have shown great promise for improving the noise robustness of Automatic Speech Recognition (ASR) systems. The common approach uses a bank of 2D Gabor filters to process the speech signal spectrogram and generate the output feature vector. This approach generates a large number of coefficients, which makes feature dimensionality reduction necessary. The proposed Gaussian Power flow Orientation Coefficients (GPOCs) take an alternative approach in which only the largest coefficients output by a bank of 2D Gaussian kernels are used to describe the spectro-temporal patterns of power flow in the auditory spectrogram. While reducing the size of the feature vectors, the algorithm was shown to outperform traditional feature extraction methods, and even a reference spectro-temporal approach, at low SNRs. At high SNRs its performance is comparable to, though lower than, that of traditional ASR front ends, and it falls behind state-of-the-art algorithms in all noise scenarios.
Keywords :
Gabor filters; Gaussian processes; channel bank filters; feature extraction; speech recognition; 2D Gabor filter bank; 2D Gaussian kernel bank; ASR frontends; ASR system; GPOCs; Gaussian power flow orientation coefficients; SNRs; auditory spectrogram; automatic speech recognition systems; feature dimensionality reduction; feature extraction methods; feature vector size reduction; noise-robust speech recognition; output feature vector generation; reference spectro-temporal approach; spectro-temporal features; speech signal spectrogram processing; Feature extraction; Gabor filters; Kernel; Load flow; Spectrogram; Speech; Training; 2D Gaussian; ASR; kernel; noise-robust; spectro-temporal;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European
Conference_Location :
Lisbon
Type :
conf
Filename :
6952533