DocumentCode
867599
Title
Incorporating a psychoacoustical model in frequency domain speech enhancement
Author
Hu, Yi ; Loizou, Philipos C.
Author_Institution
Dept. of Electr. Eng., Univ. of Texas, Richardson, TX, USA
Volume
11
Issue
2
fYear
2004
Firstpage
270
Lastpage
273
Abstract
A frequency domain optimal linear estimator is proposed which incorporates the masking properties of the human auditory system to make the residual noise distortion inaudible. The use of wavelet-thresholded multitaper spectra is also proposed for frequency-domain speech enhancement methods as an alternative to the traditional fast Fourier transform (FFT)-based magnitude spectra. Experiments with multitalker babble noise indicated that the proposed estimator outperformed the minimum mean-square error log-spectral amplitude estimator (MMSE-LSA), particularly when wavelet-thresholded multitaper spectra were used in place of the FFT spectra.
Keywords
acoustic noise; frequency-domain analysis; hearing; least mean squares methods; spectral analysis; speech enhancement; MMSE; fast Fourier transform based magnitude spectra; frequency domain optimal linear estimator; frequency domain speech enhancement method; human auditory system; masking property; minimum mean-square error log-spectral amplitude estimator; multitalker babble noise; musical noise; power spectrum estimation; psychoacoustical model; residual noise distortion; wavelet-thresholded multitaper spectra; Amplitude estimation; Auditory system; Fast Fourier transforms; Frequency domain analysis; Frequency estimation; Humans; Noise level; Psychoacoustic models; Psychology; Speech enhancement;
fLanguage
English
Journal_Title
Signal Processing Letters, IEEE
Publisher
IEEE
ISSN
1070-9908
Type
jour
DOI
10.1109/LSP.2003.821714
Filename
1261997