Regional Information Center for Science and Technology - Exploiting the baseband phase structure of the voiced speech for speech enhancement

DocumentCode :

179789

Title :

Exploiting the baseband phase structure of the voiced speech for speech enhancement

Author :

Patil, Sumit Prakash ; Gowdy, John N.

Author_Institution :

Dept. of Electr. & Comput. Eng., Clemson Univ., Clemson, SC, USA

fYear :

2014

fDate :

4-9 May 2014

Firstpage :

6092

Lastpage :

6096

Abstract :

The performance of traditional speech enhancement techniques such as spectral subtraction and log-Minimum Mean Squared Error Short-Time Spectral Amplitude (log-MMSE STSA) estimation degrades in the presence of highly non-stationary noise such as babble noise. This is mainly due to inaccurate noise estimation during the voiced segments of the speech signal. In this paper, we propose to exploit the fine structure of the phase spectra of voiced speech in the baseband STFT domain. This phase structure is used to detect the noise-dominant frequency bins in voiced frames, and this information is used to achieve better non-stationary noise Power Spectral Density (PSD) estimation. Using this estimate, the performance of spectral subtraction and log-MMSE STSA improves overall by 0.3 and 0.2, respectively, in terms of the Perceptual Evaluation of Speech Quality (PESQ) measure over the original algorithms when noisy speech is used for pitch estimation. We also present a combination of these two algorithms (spectral subtraction and log-MMSE STSA) that achieves an overall PESQ improvement of 0.5 over standard log-MMSE STSA when accurate pitch estimation is available.
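As a rough illustration of the two baseline ingredients the abstract names, the sketch below shows textbook power spectral subtraction with a spectral floor, plus a noise PSD update restricted to bins flagged as noise dominant. It is not the authors' method: the abstract does not say how the flags are derived from the baseband phase structure, so the mask is taken as a given input here. The recursive-averaging update, the function names, and the parameters `alpha` and `floor` are all assumptions made for this example.

```python
import math


def update_noise_psd(noise_psd, noisy_power, noise_dominant, alpha=0.9):
    """Recursively smooth the noise PSD, only in bins flagged as noise dominant.

    Assumption: first-order recursive averaging is a generic choice for this
    sketch, not a detail taken from the paper. `noise_dominant` is a list of
    booleans supplied by the caller; its computation is not shown here.
    """
    return [
        alpha * n + (1.0 - alpha) * p if flagged else n
        for n, p, flagged in zip(noise_psd, noisy_power, noise_dominant)
    ]


def spectral_subtraction_gain(noisy_power, noise_psd, floor=0.01):
    """Per-bin amplitude gain for power spectral subtraction.

    The gain is sqrt(max(1 - noise/noisy, floor)); the floor keeps the gain
    from collapsing to zero (or going imaginary) when the noise estimate
    exceeds the observed power.
    """
    gains = []
    for p, n in zip(noisy_power, noise_psd):
        if p <= 0.0:
            # No observed energy in this bin, so there is nothing to scale.
            gains.append(0.0)
            continue
        gains.append(math.sqrt(max(1.0 - n / p, floor)))
    return gains


# Hypothetical single-frame example with four frequency bins.
noisy_power = [4.0, 1.0, 9.0, 0.5]
noise_psd = [1.0, 1.0, 1.0, 1.0]
noise_dominant = [False, True, False, True]

updated_psd = update_noise_psd(noise_psd, noisy_power, noise_dominant, alpha=0.5)
gains = spectral_subtraction_gain(noisy_power, updated_psd)
```

In a full enhancer these gains would multiply the noisy STFT magnitudes frame by frame, with the noisy phase reused for resynthesis; that step is omitted to keep the sketch short.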

Keywords :

least mean squares methods; phase estimation; speech enhancement; PESQ measure; PSD estimation; babble noise; baseband STFT domain; baseband phase structure; log-MMSE STSA estimation; log-minimum mean squared error short time spectral amplitude; noise dominant frequency bin detection; nonstationary noise power spectral density estimation; perceptual evaluation of speech quality measure; phase spectra; pitch estimation; spectral subtraction; speech enhancement techniques; voiced speech; Estimation; Harmonic analysis; Noise; Noise measurement; Speech; PESQ

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on

Conference_Location :

Florence

Type :

conf

DOI :

10.1109/ICASSP.2014.6854774

Filename :

6854774

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=179789