DocumentCode
454611
Title
Novel Feature Extraction for Noise Robust ASR using the Aurora 2 Database
Author
Hix, Penny ; Zahorian, Stephen ; Meng, Fansheng
Author_Institution
Dept. of Electr. Eng., Old Dominion Univ.
Volume
1
fYear
2006
fDate
14-19 May 2006
Abstract
This paper presents speech signal modeling techniques that are well suited to robust recognition of connected digits in noisy environments. After several preprocessing steps, speech is represented by a block encoding of the discrete cosine transform of its spectra. We combine linear predictive coding (LPC), morphological filtering, and long block lengths to obtain robust features for improved recognition in noisy environments. The spectral envelope is first estimated by LPC. Subsequent morphological filtering enhances the peaks while smoothing the valleys, which are more affected by noise in the signal. These techniques were tested with the Aurora 2 database and the standard HMM recognizer as defined by the ETSI STQ-AURORA DSR Working Group for WI007. With no major increase in computational demand, a 23% reduction in word error rate (WER) was achieved relative to the WI007 baseline MFCC front end under multi-condition training. The basic conclusion is that the features resulting from the methods presented here perform better than cepstral features for ASR of noisy speech.
Keywords
block codes; discrete cosine transforms; feature extraction; hidden Markov models; linear predictive coding; smoothing methods; speech coding; speech recognition; transform coding; Aurora 2 database; ETSI STQ-AURORA DSR Working group; HMM recognizer; automatic speech recognition; block-encoding; cepstral features; discrete cosine transform; feature extraction; hidden Markov model; linear predictive coding; morphological filtering; noise robust ASR; speech signal modeling techniques; word error rate; Automatic speech recognition; Discrete cosine transforms; Feature extraction; Filtering; Linear predictive coding; Noise robustness; Nonlinear filters; Spatial databases; Speech recognition; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
Conference_Location
Toulouse
ISSN
1520-6149
Print_ISBN
1-4244-0469-X
Type
conf
DOI
10.1109/ICASSP.2006.1660077
Filename
1660077