DocumentCode :
1759336
Title :
Stochastic-Deterministic MMSE STFT Speech Enhancement With General A Priori Information
Author :
McCallum, M. ; Guillemin, B.
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Auckland, Auckland, New Zealand
Volume :
21
Issue :
7
fYear :
2013
fDate :
July 2013
Firstpage :
1445
Lastpage :
1457
Abstract :
A wide range of Bayesian short-time spectral amplitude (STSA) speech enhancement algorithms exist, varying in both the statistical model used for speech and the cost functions considered. Current algorithms of this class consistently assume that clean speech short-time Fourier transform (STFT) samples are either randomly distributed with zero mean or deterministic. No single distribution function has been considered that captures both deterministic and random signal components. In this paper, a Bayesian STSA algorithm is proposed under a stochastic-deterministic (SD) speech model that makes provision for the inclusion of a priori information by considering a non-zero mean. Analytical expressions are derived for the speech STFT magnitude in the MMSE sense, and for the phase in the maximum-likelihood sense. Furthermore, a practical method of estimating the a priori SD speech model parameters is described, based on explicit consideration of harmonically related sinusoidal components in each STFT frame and of variations in both the magnitude and phase of these components between successive STFT frames. Objective tests using the PESQ measure indicate that the proposed algorithm results in superior speech quality when compared to several other speech enhancement algorithms. In particular, the proposed algorithm shows an improved capability to retain low-amplitude voiced speech components in low-SNR conditions.
Keywords :
Bayes methods; Fourier transforms; least mean squares methods; maximum likelihood estimation; speech enhancement; statistical analysis; Bayesian short-time spectral amplitude algorithms; PESQ measure; SD speech model; STSA algorithm; clean speech short time Fourier transform samples; cost functions; deterministic components; explicit consideration; general a priori information; harmonically related sinusoidal components; low SNR conditions; low amplitude voiced speech components; maximum-likelihood method; nonzero mean; objective tests; random signal components; speech magnitude; speech quality; statistical model; stochastic-deterministic MMSE STFT speech enhancement; zero mean; Discrete Fourier transforms; Estimation; Harmonic analysis; Noise; Speech; Stochastic processes; Amplitude estimation; Gaussian processes; minimum mean-square error; phase estimation; stochastic deterministic model
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2013.2253100
Filename :
6480795