DocumentCode :
2255689
Title :
Lombard effect compensation and noise suppression for noisy Lombard speech recognition
Author :
Chi, Sang-Mun ; Oh, Yung-Hwan
Author_Institution :
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Taejon, South Korea
Volume :
4
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
2013
Abstract :
The performance of a speech recognition system degrades rapidly in the presence of ambient noise. To reduce the degradation, a degradation model is proposed which represents the spectral changes in a speech signal uttered in a noisy environment. The model uses frequency warping and amplitude scaling of each frequency band to simulate the variations of formant location, formant bandwidth, pitch, spectral tilt and energy in each frequency band caused by the Lombard effect. Another Lombard effect, the variation of overall vocal intensity, is represented by a multiplicative constant term depending on the spectral magnitude of the input speech. The noise contamination is represented by an additive term in the frequency domain. According to this degradation model, the cepstral vector of clean speech is estimated from that of noisy Lombard speech using spectral subtraction, spectral magnitude normalization, band-pass filtering in the Lin-Log spectral domain, and multiple linear transformations. Noisy Lombard speech data is collected by simulating noisy environments using noises from automobiles, an exhibition hall, telephone booths in crowded downtown streets, and computer rooms. The proposed method significantly reduces error rates in the recognition of 50 Korean words. For example, at a signal-to-noise ratio (SNR) of 10 dB, the recognition rate is 95.91% with this method and 79.68% without it.
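The first two compensation steps named in the abstract, spectral subtraction and spectral magnitude normalization, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified degradation model Y[k] = g * S[k] + N[k] with no frequency warping or per-band scaling, and the function names, the spectral floor, and the sum-to-one normalization are hypothetical choices made only for illustration.

```python
from typing import List, Sequence


def spectral_subtract(noisy_mag: Sequence[float],
                      noise_mag: Sequence[float],
                      floor: float = 0.01) -> List[float]:
    """Remove an additive noise estimate from each frequency band.

    The floor (an assumed value) keeps every band positive by never
    letting the result drop below a small fraction of the noisy magnitude.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bands")
    return [max(y - n, floor * y) for y, n in zip(noisy_mag, noise_mag)]


def normalize_magnitude(mag: Sequence[float]) -> List[float]:
    """Scale a spectrum so its bands sum to one.

    This cancels a multiplicative gain that is constant across bands,
    which is one assumed way to model a change in overall vocal intensity.
    """
    total = sum(mag)
    if total == 0:
        return list(mag)
    return [m / total for m in mag]


# Toy data (invented for this sketch, not taken from the paper).
clean = [1.0, 2.0, 4.0, 1.0]          # clean magnitude spectrum S[k]
gain = 2.0                            # overall intensity change g
noise = [0.5, 0.5, 0.5, 0.5]          # additive noise spectrum N[k]
noisy = [gain * s + n for s, n in zip(clean, noise)]

estimate = normalize_magnitude(spectral_subtract(noisy, noise))
reference = normalize_magnitude(clean)
```

With a perfect noise estimate, `estimate` matches `reference` in this toy case, because subtraction removes the additive term and normalization removes the constant gain. The band-pass filtering in the Lin-Log domain and the multiple linear transformations described in the abstract are not covered here.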
Keywords :
acoustic filters; acoustic noise; band-pass filters; compensation; frequency-domain analysis; noise abatement; speech recognition; Korean word recognition; Lin-Log spectral domain; Lombard effect compensation; SNR; additive term; ambient noise; amplitude scaling; automobiles; band-pass filtering; cepstral vector; computer rooms; crowded streets; energy; exhibition hall; formant bandwidth; formant location; frequency bands; frequency warping; input speech spectral magnitude; multiple linear transformations; multiplicative constant term; noise contamination; noise suppression; noisy Lombard speech recognition; noisy environments; performance degradation; pitch; recognition error rates; spectral magnitude normalization; spectral subtraction; spectral tilt; speech signal spectral changes; telephone booths; vocal intensity variation; Additive noise; Bandwidth; Cepstral analysis; Contamination; Degradation; Frequency domain analysis; Noise reduction; Speech recognition; Vectors; Working environment noise;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607193
Filename :
607193