Gender-dependent and speaker-dependent speech enhancement

Author

Potamitis, I. ; Fakotakis, N. ; Kokkinakis, G.

Author_Institution

Wire Communications Laboratory, Electrical and Computer Engineering Dept., University of Patras, 261 10 Rion, Greece

Volume

1

fYear

2002

fDate

13-17 May 2002

Abstract

Our work introduces a speech enhancement technique whose formalism explicitly incorporates prior information about gender-specific or speaker-specific time-frequency characteristics. We approximate the multimodal distribution of the clean-speech linear spectral magnitude with a mixture of Gaussian pdfs, trained with the Expectation-Maximization (EM) algorithm. Subsequently, we apply the Bayesian inference framework to the degraded spectral coefficients and, by employing Minimum Mean Square Error (MMSE) estimation, derive a closed-form solution for the spectral magnitude estimation task that is adapted to the spectral characteristics and noise variance of each band. We suggest that 2–3 minutes of phonetically balanced, non-degraded, gender-dependent or speaker-dependent speech is adequate to tune our algorithm. We demonstrate the benefit of using an enhancement technique tailored to a specific gender or speaker and propose its use in cases where message ambiguity is of critical importance. We evaluate our algorithm using Lynx helicopter noise and white Gaussian noise, both on the task of improving speech quality and in combination with a speech coder, and demonstrate its robustness at very low SNRs. Implementation code is available at: http://slt.wcl.ee.upatras.gr/potamitis/index.html
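
The abstract does not spell out the observation model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that, within one frequency band, the observed value is y = x + n, where x follows a Gaussian mixture prior (as would be fitted by EM on clean speech) and n is zero-mean Gaussian noise with a known variance for that band. Under that assumption the MMSE estimate E[x | y] has a closed form: a posterior-weighted sum of per-component Wiener-like estimates. The function name `mmse_gmm_estimate` and all parameter names are hypothetical, and the sketch does not enforce the non-negativity of a true spectral magnitude.

```python
import math


def mmse_gmm_estimate(y, weights, means, variances, noise_var):
    """Closed-form MMSE estimate of x from y = x + n for one band.

    Assumptions (illustrative, not taken from the paper):
      x ~ sum_k weights[k] * N(means[k], variances[k])   (prior from clean speech)
      n ~ N(0, noise_var)                                 (band noise, independent of x)
    """
    log_evidence = []
    cond_means = []
    for w, m, v in zip(weights, means, variances):
        total_var = v + noise_var
        # Marginal likelihood of y under component k: N(y; m, v + noise_var),
        # kept in the log domain to avoid underflow at low SNR.
        log_evidence.append(
            math.log(w)
            - 0.5 * math.log(2.0 * math.pi * total_var)
            - 0.5 * (y - m) ** 2 / total_var
        )
        # Posterior mean of x given y and component k (Wiener-like shrinkage
        # of the observation toward the component mean).
        cond_means.append((v * y + noise_var * m) / total_var)

    # Posterior component probabilities via a numerically stable softmax.
    peak = max(log_evidence)
    unnorm = [math.exp(le - peak) for le in log_evidence]
    norm = sum(unnorm)

    # E[x | y] = sum_k P(k | y) * E[x | y, k]
    return sum(u / norm * cm for u, cm in zip(unnorm, cond_means))
```

In a full enhancer this function would be applied independently to each band of each frame, with one mixture and one noise variance per band. With a single component it reduces to the familiar linear shrinkage of y toward the prior mean.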

Keywords

Distance measurement; Estimation; Noise; Speech; Speech enhancement; Weight measurement;

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on

Conference_Location

Orlando, FL, USA

ISSN

1520-6149

Print_ISBN

0-7803-7402-9

Type

conf

DOI

10.1109/ICASSP.2002.5743701

Filename

5743701