Title :
On properties of modulation spectrum for robust automatic speech recognition
Author :
Kanedera, Noboru ; Hermansky, Hynek ; Arai, Takayuki
Author_Institution :
Ishikawa Nat. Coll. of Technol., Japan
Abstract :
We report on the effect of band-pass filtering of the time trajectories of spectral envelopes on speech recognition. Several types of filters (linear-phase FIR, DCT, and DFT) are studied. Results indicate the relative importance of different components of the modulation spectrum of speech for ASR. The general conclusions are: (1) most of the useful linguistic information lies in modulation frequency components in the range between 1 and 16 Hz, with the dominant component at around 4 Hz; (2) it is important to preserve the phase information in the modulation frequency domain; (3) features that include components at around 4 Hz in the modulation spectrum outperform the conventional delta features; (4) features that represent several modulation frequency bands with appropriate center frequencies and bandwidths increase recognition performance.
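The band-pass filtering of a feature time trajectory described in the abstract can be illustrated with a minimal sketch. It is not the authors' filter design: the 100 Hz frame rate, the Hamming-windowed sinc design, the tap count, and the function names `bandpass_fir` and `filter_trajectory` are all assumptions made for illustration. The taps are symmetric, so the filter is linear-phase, and the delay is compensated so that the phase of the modulation components is preserved.

```python
import math


def bandpass_fir(low_hz, high_hz, frame_rate_hz, num_taps=101):
    """Design a linear-phase FIR band-pass filter (Hamming-windowed sinc).

    Hypothetical helper: the design method and tap count are assumptions,
    not taken from the paper. num_taps must be odd so the taps are
    symmetric around a single center tap.
    """
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd")
    mid = (num_taps - 1) // 2
    f1 = low_hz / frame_rate_hz   # lower band edge, cycles per frame
    f2 = high_hz / frame_rate_hz  # upper band edge, cycles per frame
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            ideal = 2.0 * (f2 - f1)
        else:
            # Ideal band-pass = difference of two ideal low-pass responses.
            ideal = (math.sin(2.0 * math.pi * f2 * k)
                     - math.sin(2.0 * math.pi * f1 * k)) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def filter_trajectory(trajectory, taps):
    """Filter one time trajectory (one spectral band or cepstral coefficient
    sampled once per frame). The output is aligned with the input, i.e. the
    constant FIR delay is removed; frames outside the signal count as zero.
    """
    mid = (len(taps) - 1) // 2
    length = len(trajectory)
    out = []
    for t in range(length):
        acc = 0.0
        for j, h in enumerate(taps):
            i = t + mid - j
            if 0 <= i < length:
                acc += h * trajectory[i]
        out.append(acc)
    return out


# Example: keep the 1 to 16 Hz modulation band, assuming 100 frames per second.
FRAME_RATE_HZ = 100.0  # assumption: 10 ms frame step
taps_1_16 = bandpass_fir(1.0, 16.0, FRAME_RATE_HZ)
```

In practice the same filter would be applied independently to every feature dimension, treating each as a separate trajectory over frames. Near the start and end of an utterance the output is affected by the zero padding, so only frames at least half a filter length from the edges are fully filtered.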
Keywords :
FIR filters; band-pass filters; cepstral analysis; discrete Fourier transforms; discrete cosine transforms; filtering theory; frequency modulation; speech recognition; 1 to 16 Hz; DCT filter; DFT filter; band-pass filtering; linear-phase FIR filter; modulation frequency components; modulation frequency domain; modulation spectrum; phase information; robust automatic speech recognition; spectral envelopes; time trajectories; useful linguistic information; Automatic speech recognition; Band pass filters; Delta modulation; Discrete cosine transforms; Filtering; Finite impulse response filter; Frequency modulation; Phase modulation; Robustness; Speech recognition
Conference_Title :
Acoustics, Speech and Signal Processing, 1998. Proceedings of the 1998 IEEE International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7803-4428-6
DOI :
10.1109/ICASSP.1998.675339