DocumentCode
417222
Title
Speech modeling and voiced/unvoiced/mixed/silence speech segmentation with fractionally Gaussian noise based models
Author
Oveisgharan, Sh ; Shamsollahi, M.B.
Author_Institution
Dept. of Electr. Eng., Sharif Univ. of Technol., Tehran, Iran
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
The ARMA filtered fractionally differenced Gaussian noise (FdGn) model and a new AR filtered FdGn added up model are applied to a speech signal, and the performance of their parameters in voiced/unvoiced/mixed/silence (V/U/M/S) speech classification is evaluated against the zero crossing rate (ZCR) feature. For parameter estimation of the AR filtered FdGn model, two methods were applied: the iterative maximum likelihood (ML) method of Tewfik (1993) and a new computationally efficient linear minimum square error (LMSE) algorithm. For parameter estimation of the new added up model, two approaches were also implemented: an expectation-maximization (EM) based approach and an iterative MSE approach. The described models and methods were applied to a speech signal and to its real cepstrum. Based on the J1 parameter, the performance of the described models on V/U/M/S speech classification was obtained in this order: added up model on real cepstrum of speech, filtered FdGn model on real cepstrum of speech (LMSE method), filtered FdGn model on speech (LMSE method), ZCR, and filtered FdGn model on speech (Tewfik method).
Keywords
Gaussian noise; autoregressive moving average processes; cepstral analysis; feature extraction; iterative methods; least squares approximations; maximum likelihood estimation; signal classification; speech processing; AR filtered FdGn added up model; ARMA filtered fractionally differenced Gaussian noise; EM based approach; FdGn model; ML method; V/U/M/S speech classification; ZCR feature; computationally efficient LMSE algorithm; expectation-maximization based approach; fractionally Gaussian noise based models; iterative MSE; iterative maximum likelihood; linear minimum square error; parameter estimation; performance; speech modeling; speech segmentation; speech signal; speech signal cepstrum; speech unvoiced/voiced/mixed/silence classification; voiced/unvoiced/mixed/silence speech; zero crossing rate; Cepstrum; Filtering; Gaussian noise; Iterative methods; Maximum likelihood estimation; Nonlinear filters; Parameter estimation; Speech analysis; Speech enhancement; White noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326060
Filename
1326060