DocumentCode
2652050
Title
Evaluation of different feature extraction methods for speech recognition in car environment
Author
Wolf, Martin ; Nadeu, Climent
Author_Institution
Dept. of Signal Theor. & Commun., Univ. Politec. de Catalunya, Barcelona
fYear
2008
fDate
25-28 June 2008
Firstpage
359
Lastpage
362
Abstract
In this paper, the performance of robust feature extraction techniques for speech recognition is evaluated in a car noise environment. Starting from the basic log mel-scaled filter-bank energies, both the application of the minimum variance distortionless response (MVDR) method and the decorrelating transformation (either the DCT or frequency filtering (FF)) are considered. In this way, five different types of feature extraction techniques were compared using the Spanish version of the SDC-Aurora database, either with or without dropping frames labeled as silence by a voice activity detector. According to the results, which were obtained after extensive parameter tuning, the MVDR method slightly improves the results for both cepstral coefficients (CC) and FF parameters in most tested conditions. On the other hand, the FF-based techniques perform significantly better than the CC-based ones under high-mismatch conditions (a relative improvement of more than 33%). The best average accuracies result from the new MVDR-FF combination.
Keywords
audio databases; cepstral analysis; distortion; feature extraction; filtering theory; speech recognition; SDC-Aurora database; car noise environment; cepstral coefficients; decorrelating transformation; extensive parameter tuning; feature extraction; frequency filtering; mel-scaled filter-bank energies; minimum variance distortionless response; speech recognition; voice activity detector; Decorrelation; Detectors; Discrete cosine transforms; Feature extraction; Filtering; Frequency; Noise robustness; Spatial databases; Speech recognition; Working environment noise; feature extraction; frequency filtering; minimum variance distortionless response; speech recognition in car environment;
fLanguage
English
Publisher
IEEE
Conference_Titel
Systems, Signals and Image Processing, 2008. IWSSIP 2008. 15th International Conference on
Conference_Location
Bratislava
Print_ISBN
978-80-227-2856-0
Electronic_ISBN
978-80-227-2880-5
Type
conf
DOI
10.1109/IWSSIP.2008.4604441
Filename
4604441