DocumentCode
542303
Title
A comparison of front-end configurations for robust speech recognition
Author
Milner, Ben
Author_Institution
School of Information Systems, University of East Anglia, Norwich, UK
Volume
1
fYear
2002
fDate
13-17 May 2002
Abstract
This paper presents a comparative analysis of the processing stages involved in feature extraction for speech recognition. Feature extraction is considered as comprising three processing stages: static feature extraction, normalisation, and inclusion of temporal information. In each stage, techniques are compared both theoretically and in terms of their performance. The analysis shows that while some techniques may appear significantly different, their effect on the signal can be similar. Comparative studies include MFCC and PLP analysis, RASTA filtering and cepstral mean normalisation, and temporal derivatives and cepstral-time matrices. Experimental results on an unconstrained monophone task compare recognition performance using different front-end configurations.
Keywords
Acceleration; Cutoff frequency; Filtering; Mel frequency cepstral coefficient; Robustness; Speech
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location
Orlando, FL, USA
ISSN
1520-6149
Print_ISBN
0-7803-7402-9
Type
conf
DOI
10.1109/ICASSP.2002.5743838
Filename
5743838