A Feature for Voice Activity Detection Derived from Speech Analysis with the Exponential Autoregressive Model

Author

Ishizuka, Kentaro ; Kato, Hiroko

Author_Institution

NTT Commun. Sci. Lab., NTT Corp.

Volume

1

fYear

2006

fDate

14-19 May 2006

Abstract

This paper proposes a feature for voice activity detection (VAD) obtained from a speech signal analysis that uses the exponential autoregressive (ExpAR) model. This model includes exponential terms in its AR coefficients that depend on the amplitude of the observed signal. Since these terms can model the nonlinearity of speech caused by the nonlinear fluctuation of vocal cord vibration, the model can provide a better fit for speech signals. A parameter in the exponential terms of the ExpAR model, called "the scaling parameter," is directly associated with the degree of nonlinearity of the analyzed signal. Therefore, the scaling parameter changes when the observed signal includes speech. Based on this property, the parameter is usable as a feature for VAD under noisy conditions. An experiment using noisy speech data confirmed the potential performance of the proposed feature by comparing receiver operating characteristic (ROC) curves obtained from the proposed feature and from conventional robust features. A second experiment compared the recall, precision, and F-measure of speech interval detection achieved by our proposed VAD algorithm, which used only the proposed feature, with those of two widely used standardized algorithms. The results showed that the proposed method could achieve better performance than the standardized algorithms.
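The abstract does not state the model equation or the estimation procedure, so the following is only a minimal sketch under stated assumptions. It assumes the commonly cited first-order ExpAR form x[t] = (phi + pi * exp(-gamma * x[t-1]^2)) * x[t-1] + e[t], where gamma plays the role of the scaling parameter. For each candidate gamma the model is linear in phi and pi, so the sketch solves those by least squares and keeps the gamma with the smallest residual; this grid search is a stand-in, not the authors' method. The function names simulate_expar1 and fit_expar1, the model order, and all numeric values are hypothetical.

```python
import math
import random


def simulate_expar1(n, phi, pi, gamma, noise_std=0.1, seed=0):
    """Simulate x[t] = (phi + pi*exp(-gamma*x[t-1]**2)) * x[t-1] + e[t]."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        prev = x[-1]
        # The AR coefficient depends on the amplitude of the previous sample.
        coef = phi + pi * math.exp(-gamma * prev * prev)
        x.append(coef * prev + rng.gauss(0.0, noise_std))
    return x


def fit_expar1(x, gammas):
    """Grid-search gamma; solve phi and pi by least squares for each candidate.

    Returns (gamma, phi, pi, rss) for the candidate with the smallest
    residual sum of squares, or None if every candidate was skipped.
    """
    y = x[1:]   # targets x[t]
    a = x[:-1]  # linear regressor x[t-1]
    saa = sum(p * p for p in a)
    say = sum(p * t for p, t in zip(a, y))
    best = None
    for gamma in gammas:
        # Amplitude-dependent regressor exp(-gamma*x[t-1]**2) * x[t-1].
        b = [math.exp(-gamma * v * v) * v for v in a]
        sbb = sum(q * q for q in b)
        sab = sum(p * q for p, q in zip(a, b))
        sby = sum(q * t for q, t in zip(b, y))
        det = saa * sbb - sab * sab
        if abs(det) <= 1e-12 * saa * sbb:
            continue  # regressors nearly collinear; skip this gamma
        # 2x2 normal equations solved with Cramer's rule.
        phi = (say * sbb - sby * sab) / det
        pi = (sby * saa - say * sab) / det
        rss = sum((t - phi * p - pi * q) ** 2 for p, q, t in zip(a, b, y))
        if best is None or rss < best[3]:
            best = (gamma, phi, pi, rss)
    return best


if __name__ == "__main__":
    # Hypothetical parameter values chosen only to give a stable signal.
    signal = simulate_expar1(2000, phi=0.8, pi=-1.2, gamma=50.0)
    grid = [1, 2, 5, 10, 20, 50, 100, 200, 500]
    print(fit_expar1(signal, grid))
```

In a VAD setting along the lines the abstract describes, one would presumably run such a fit on short frames and track the estimated gamma over time; the frame length, model order, and decision rule are not given in this record.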

Keywords

autoregressive processes; speech processing; vibrations; exponential autoregressive model; noisy speech data; nonlinear fluctuation; speech interval detection; speech signal analysis; vocal cord vibration; voice activity detection; Fluctuations; Laboratories; Robustness; Signal analysis; Signal processing algorithms; Signal to noise ratio; Speech analysis; Speech coding; Speech enhancement; Working environment noise

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on

Conference_Location

Toulouse

ISSN

1520-6149

Print_ISBN

1-4244-0469-X

Type

conf

DOI

10.1109/ICASSP.2006.1660139

Filename

1660139