Title :
Speech modeling and voiced/unvoiced/mixed/silence speech segmentation with fractionally Gaussian noise based models
Author :
Oveisgharan, Sh ; Shamsollahi, M.B.
Author_Institution :
Dept. of Electr. Eng., Sharif Univ. of Technol., Tehran, Iran
Abstract :
The ARMA-filtered fractionally differenced Gaussian noise (FdGn) model and a new AR-filtered FdGn added-up model are applied to a speech signal, and the performance of their parameters in voiced/unvoiced/mixed/silence (V/U/M/S) speech classification is evaluated against the zero crossing rate (ZCR) feature. For parameter estimation of the AR-filtered FdGn model, two methods were applied: the iterative maximum likelihood (ML) method of Tewfik (1993) and a new, computationally efficient linear minimum square error (LMSE) algorithm. For parameter estimation of the new added-up model, two approaches were implemented: an expectation-maximization (EM) based approach and an iterative MSE approach. The described models and methods were applied both to a speech signal and to its real cepstrum. Based on the J1 parameter, the performance of the described models in V/U/M/S speech classification was obtained in this order: added-up model on the real cepstrum of speech, filtered FdGn model on the real cepstrum of speech (LMSE method), filtered FdGn model on speech (LMSE method), ZCR, and filtered FdGn model on speech (Tewfik method).
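As a rough illustration of three quantities the abstract names (the ZCR baseline feature, the real cepstrum, and fractionally differenced Gaussian noise), the following minimal Python sketch may help. It is not the authors' implementation: the function names, the frame-based interface, the `eps` guard and the truncated moving-average approximation of the fractional differencing operator are all assumptions made for illustration, and the paper's ARMA/AR filtering and parameter estimation steps are not reproduced.

```python
import numpy as np


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    signs = np.signbit(frame)
    return float(np.mean(signs[1:] != signs[:-1]))


def real_cepstrum(frame, eps=1e-12):
    """Inverse FFT of the log magnitude spectrum; eps (assumed) avoids log(0)."""
    frame = np.asarray(frame, dtype=float)
    log_mag = np.log(np.abs(np.fft.fft(frame)) + eps)
    return np.real(np.fft.ifft(log_mag))


def fd_weights(d, n):
    """First n coefficients of the expansion of (1 - B)^(-d).

    Uses the recursion psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k.
    """
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi


def fdgn(d, n, rng=None):
    """Approximate FdGn: white Gaussian noise filtered by the truncated weights."""
    rng = np.random.default_rng() if rng is None else rng
    white = rng.standard_normal(n)
    return np.convolve(white, fd_weights(d, n))[:n]
```

In a framewise setting, one would presumably compute such features on short windows of the signal and compare how well each separates the V/U/M/S classes; setting `d = 0` in `fdgn` reduces the output to plain white Gaussian noise.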
Keywords :
Gaussian noise; autoregressive moving average processes; cepstral analysis; feature extraction; iterative methods; least squares approximations; maximum likelihood estimation; signal classification; speech processing; AR filtered FdGn added up model; ARMA filtered fractionally differenced Gaussian noise; EM based approach; FdGn model; ML method; V/U/M/S speech classification; ZCR feature; computationally efficient LMSE algorithm; expectation-maximization based approach; fractionally Gaussian noise based models; iterative MSE; iterative maximum likelihood; linear minimum square error; parameter estimation; performance; speech modeling; speech segmentation; speech signal; speech signal cepstrum; speech unvoiced/voiced/mixed/silence classification; voiced/unvoiced/mixed/silence speech; zero crossing rate; Cepstrum; Filtering; Gaussian noise; Iterative methods; Maximum likelihood estimation; Nonlinear filters; Parameter estimation; Speech analysis; Speech enhancement; White noise;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326060