DocumentCode :
945183
Title :
Voice activity detection based on multiple statistical models
Author :
Chang, Joon-Hyuk ; Kim, Nam Soo ; Mitra, Sanjit K.
Author_Institution :
Dept. of Electron. Eng., Inha Univ., Incheon, South Korea
Volume :
54
Issue :
6
fYear :
2006
fDate :
6/1/2006
Firstpage :
1965
Lastpage :
1976
Abstract :
One of the key issues in practical speech processing is achieving voice activity detection (VAD) that is robust to background noise. Most statistical model-based approaches employ the Gaussian assumption in the discrete Fourier transform (DFT) domain, which, however, deviates from real observations. In this paper, we propose a class of VAD algorithms based on several statistical models. In addition to the Gaussian model, we incorporate the complex Laplacian and Gamma probability density functions into our analysis of statistical properties. Using goodness-of-fit tests, we analyze the statistical properties of the DFT spectra of noisy speech under various noise conditions. Based on this statistical analysis, the likelihood ratio test under the given statistical models is established for the purpose of VAD. Because the statistical characteristics of the speech signal are affected differently by noise type and level, our approach aims to cope with time-varying environments by adaptively finding an appropriate statistical model in an online fashion. The performance of the proposed VAD approaches in both stationary and nonstationary noise environments is evaluated with the aid of an objective measure.
Keywords :
Gaussian processes; discrete Fourier transforms; signal detection; speech processing; statistical analysis; DFT; Gamma probability density functions; Gaussian assumption; Laplacian probability density functions; background noise; discrete Fourier transform; goodness-of-fit tests; statistical models; voice activity detection; Background noise; Discrete Fourier transforms; Laplace equations; Noise robustness; Probability density function; Speech analysis; Speech enhancement; Speech processing; Testing; Working environment noise; Discrete cosine transform (DCT); generalized gamma function; maximum likelihood
fLanguage :
English
Journal_Title :
Signal Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1053-587X
Type :
jour
DOI :
10.1109/TSP.2006.874403
Filename :
1634796