  • DocumentCode
    867599
  • Title
    Incorporating a psychoacoustical model in frequency domain speech enhancement
  • Author
    Hu, Yi; Loizou, Philipos C.
  • Author_Institution
    Dept. of Electr. Eng., Univ. of Texas, Richardson, TX, USA
  • Volume
    11
  • Issue
    2
  • fYear
    2004
  • Firstpage
    270
  • Lastpage
    273
  • Abstract
    A frequency-domain optimal linear estimator is proposed that incorporates the masking properties of the human auditory system to make the residual noise distortion inaudible. The use of wavelet-thresholded multitaper spectra is also proposed for frequency-domain speech enhancement methods as an alternative to the traditional fast Fourier transform (FFT)-based magnitude spectra. Experiments with multitalker babble noise indicated that the proposed estimator outperformed the minimum mean-square error log-spectral amplitude (MMSE-LSA) estimator, particularly when wavelet-thresholded multitaper spectra were used in place of the FFT spectra.
  • Keywords
    acoustic noise; frequency-domain analysis; hearing; least mean squares methods; spectral analysis; speech enhancement; MMSE; fast Fourier transform based magnitude spectra; frequency domain optimal linear estimator; frequency domain speech enhancement method; human auditory system; masking property; minimum mean-square error log-spectral amplitude estimator; multitalker babble noise; musical noise; power spectrum estimation; psychoacoustical model; residual noise distortion; wavelet-thresholded multitaper spectra; Amplitude estimation; Auditory system; Fast Fourier transforms; Frequency domain analysis; Frequency estimation; Humans; Noise level; Psychoacoustic models; Psychology; Speech enhancement;
  • fLanguage
    English
  • Journal_Title
    Signal Processing Letters, IEEE
  • Publisher
    IEEE
  • ISSN
    1070-9908
  • Type
    jour
  • DOI
    10.1109/LSP.2003.821714
  • Filename
    1261997