• DocumentCode
    417222
  • Title
    Speech modeling and voiced/unvoiced/mixed/silence speech segmentation with fractionally Gaussian noise based models
  • Author
    Oveisgharan, Sh ; Shamsollahi, M.B.
  • Author_Institution
    Dept. of Electr. Eng., Sharif Univ. of Technol., Tehran, Iran
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    The ARMA filtered fractionally differenced Gaussian noise (FdGn) model and a new AR filtered FdGn added up model are applied to a speech signal, and the performance of their parameters on voiced/unvoiced/mixed/silence (V/U/M/S) speech classification is evaluated against the zero crossing rate (ZCR) feature. For parameter estimation of the AR filtered FdGn model, two methods were applied: the iterative maximum likelihood (ML) method of Tewfik (1993) and a new, computationally efficient linear minimum square error (LMSE) algorithm. For parameter estimation of the new added up model, two approaches were also implemented: an expectation-maximization (EM) based approach and an iterative MSE approach. The described models and methods were applied both to a speech signal and to its real cepstrum. Based on the J1 parameter, the performance of the described models on V/U/M/S speech classification was obtained in this order: added up model on real cepstrum of speech, filtered FdGn model on real cepstrum of speech (LMSE method), filtered FdGn model on speech (LMSE method), ZCR, and filtered FdGn model on speech (Tewfik method).
  • Keywords
    Gaussian noise; autoregressive moving average processes; cepstral analysis; feature extraction; iterative methods; least squares approximations; maximum likelihood estimation; signal classification; speech processing; AR filtered FdGn added up model; ARMA filtered fractionally differenced Gaussian noise; EM based approach; FdGn model; ML method; V/U/M/S speech classification; ZCR feature; computationally efficient LMSE algorithm; expectation-maximization based approach; fractionally Gaussian noise based models; iterative MSE; iterative maximum likelihood; linear minimum square error; parameter estimation; performance; speech modeling; speech segmentation; speech signal; speech signal cepstrum; speech unvoiced/voiced/mixed/silence classification; voiced/unvoiced/mixed/silence speech; zero crossing rate; Cepstrum; Filtering; Gaussian noise; Iterative methods; Maximum likelihood estimation; Nonlinear filters; Parameter estimation; Speech analysis; Speech enhancement; White noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326060
  • Filename
    1326060