DocumentCode :
867599
Title :
Incorporating a psychoacoustical model in frequency domain speech enhancement
Author :
Hu, Yi ; Loizou, Philipos C.
Author_Institution :
Dept. of Electr. Eng., Univ. of Texas, Richardson, TX, USA
Volume :
11
Issue :
2
Year :
2004
Firstpage :
270
Lastpage :
273
Abstract :
A frequency domain optimal linear estimator is proposed which incorporates the masking properties of the human auditory system to make the residual noise distortion inaudible. The use of wavelet-thresholded multitaper spectra is also proposed for frequency-domain speech enhancement methods as an alternative to the traditional fast Fourier transform (FFT)-based magnitude spectra. Experiments with multitalker babble noise indicated that the proposed estimator outperformed the minimum mean-square error log-spectral amplitude estimator (MMSE-LSA), particularly when wavelet-thresholded multitaper spectra were used in place of the FFT spectra.
Keywords :
acoustic noise; frequency-domain analysis; hearing; least mean squares methods; spectral analysis; speech enhancement; MMSE; fast Fourier transform based magnitude spectra; frequency domain optimal linear estimator; frequency domain speech enhancement method; human auditory system; masking property; minimum mean-square error log-spectral amplitude estimator; multitalker babble noise; musical noise; power spectrum estimation; psychoacoustical model; residual noise distortion; wavelet-thresholded multitaper spectra; Amplitude estimation; Auditory system; Fast Fourier transforms; Frequency domain analysis; Frequency estimation; Humans; Noise level; Psychoacoustic models; Psychology; Speech enhancement;
Language :
English
Journal_Title :
Signal Processing Letters, IEEE
Publisher :
IEEE
ISSN :
1070-9908
Type :
jour
DOI :
10.1109/LSP.2003.821714
Filename :
1261997