• DocumentCode
    454660
  • Title

    A Feature for Voice Activity Detection Derived from Speech Analysis with the Exponential Autoregressive Model

  • Author

    Ishizuka, Kentaro ; Kato, Hiroko

  • Author_Institution
    NTT Commun. Sci. Lab., NTT Corp.
  • Volume
    1
  • fYear
    2006
  • fDate
    14-19 May 2006
  • Abstract
    This paper proposes a feature for voice activity detection (VAD) obtained from a speech signal analysis that uses the exponential autoregressive (ExpAR) model. This model employs exponential terms in its AR coefficients that depend on the amplitude of the observed signal. Since these terms can model the nonlinearity of speech caused by the nonlinear fluctuation of vocal cord vibration, the model can provide a better fit for speech signals. A parameter in the exponential terms of the ExpAR model, called the "scaling parameter," is directly associated with the degree of nonlinearity of the analyzed signal. Therefore, the scaling parameter changes when the observed signal includes speech. Based on this property, the parameter is usable as a feature for VAD under noisy conditions. An experiment using noisy speech data confirmed the potential performance of the proposed feature by comparing receiver operating characteristic (ROC) curves obtained from the proposed feature and from conventional robust features. A second experiment compared the recall, precision, and F-measure of speech interval detection achieved by our proposed VAD algorithm, which used only the proposed feature, with those of two widely used standardized algorithms. The results showed that the proposed method could achieve better performance than the standardized algorithms.
  • Keywords
    autoregressive processes; speech processing; vibrations; exponential autoregressive model; noisy speech data; nonlinear fluctuation; speech interval detection; speech signal analysis; vocal cord vibration; voice activity detection; Fluctuations; Laboratories; Robustness; Signal analysis; Signal processing algorithms; Signal to noise ratio; Speech analysis; Speech coding; Speech enhancement; Working environment noise;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on
  • Conference_Location
    Toulouse
  • ISSN
    1520-6149
  • Print_ISBN
    1-4244-0469-X
  • Type

    conf

  • DOI
    10.1109/ICASSP.2006.1660139
  • Filename
    1660139
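
The ExpAR model summarized in the abstract can be illustrated with a minimal sketch. The record does not give the model equation, so the form below is an assumption: the commonly cited exponential AR form x[t] = sum_i (phi_i + pi_i * exp(-gamma * x[t-1]^2)) * x[t-i] + e[t], with gamma standing in for the scaling parameter. The function names and the coefficient values are hypothetical and are not taken from the paper; the sketch does not reproduce the authors' estimation procedure or VAD algorithm.

```python
import math
import random


def expar_coefficients(x_prev, phi, pi, gamma):
    """Amplitude-dependent AR coefficients (assumed ExpAR form).

    Each coefficient is phi_i + pi_i * exp(-gamma * x_prev**2), so the
    exponential term fades as the previous sample's amplitude grows.
    """
    scale = math.exp(-gamma * x_prev * x_prev)
    return [a + b * scale for a, b in zip(phi, pi)]


def expar_predict(recent, phi, pi, gamma):
    """One-step prediction; recent[0] is x[t-1], recent[1] is x[t-2], ..."""
    coeffs = expar_coefficients(recent[0], phi, pi, gamma)
    return sum(c * x for c, x in zip(coeffs, recent))


def simulate_expar(n, phi, pi, gamma, noise_std=0.1, seed=0):
    """Generate n samples from the assumed ExpAR process with Gaussian noise."""
    rng = random.Random(seed)
    p = len(phi)
    x = [0.0] * p  # zero initial conditions
    for _ in range(n):
        recent = x[-1:-p - 1:-1]  # last p samples, newest first
        x.append(expar_predict(recent, phi, pi, gamma) + rng.gauss(0.0, noise_std))
    return x[p:]


if __name__ == "__main__":
    # Hypothetical coefficients chosen only so the process stays bounded.
    samples = simulate_expar(200, phi=[0.5, -0.2], pi=[0.3, 0.1], gamma=2.0)
    print(len(samples), min(samples), max(samples))
```

With gamma = 0 the exponential factor is constant and the model reduces to an ordinary linear AR model with coefficients phi_i + pi_i. For gamma > 0 the coefficients vary with signal amplitude, which is the kind of nonlinearity the abstract associates with the scaling parameter.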