Detection of human speech using hybrid recognition models

Author

Hoyt, John D. ; Wechsler, Harry

Author_Institution

Eng. Res. Facility, Federal Bureau of Investigation, Quantico, VA, USA

Volume

2

fYear

1994

fDate

9-13 Oct 1994

Firstpage

330

Abstract

This paper describes research to develop an efficient system that provides a binary decision as to the presence of speech in a short time sample of an acoustic signal. A method is described that efficiently and reliably detects human speech in the presence of structured noise (such as wind, music, or traffic sounds). Existing methods work well for detecting speech in a communications environment, but they cannot distinguish speech from a quasi-periodic signal that has a spectral power density similar to that of speech (such as music). Two separate feature sets are evaluated, and reliable detection is obtained at signal-to-noise ratios (SNR) as low as 0 dB. The algorithm is a statistical pattern classifier that uses radial basis function networks. Mel-cepstra and wavelet feature vectors are compared. A method of obtaining temporal feature information is also described.
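As a rough illustration of the kind of classifier the abstract names, the following is a minimal sketch of a radial basis function network that makes a binary speech/non-speech decision on feature vectors. It is not the authors' implementation: the function names, the Gaussian kernel width, the random choice of centers, the least-squares output layer, and the 0.5 decision threshold are all assumptions made for illustration, and the toy data are synthetic clusters standing in for real mel-cepstral or wavelet features.

```python
import numpy as np


def make_toy_data(n_per_class=100, dim=4, seed=0):
    """Synthetic stand-in for feature vectors (hypothetical, not real speech data)."""
    rng = np.random.default_rng(seed)
    speech = rng.normal(loc=2.0, scale=0.3, size=(n_per_class, dim))
    noise = rng.normal(loc=-2.0, scale=0.3, size=(n_per_class, dim))
    X = np.vstack([speech, noise])
    y = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])
    return X, y


def rbf_activations(X, centers, width):
    """Gaussian hidden-layer outputs plus a constant bias column."""
    # Squared Euclidean distance between every sample and every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    H = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([H, np.ones((X.shape[0], 1))])


def train_rbf(X, y, n_centers=8, width=2.0, seed=0):
    """Pick centers at random from the training set, then fit linear output weights."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=n_centers, replace=False)
    centers = X[idx]
    H = rbf_activations(X, centers, width)
    # Output layer is linear, so ordinary least squares is sufficient.
    weights, *_ = np.linalg.lstsq(H, y, rcond=None)
    return centers, weights


def detect_speech(X, centers, weights, width=2.0, threshold=0.5):
    """Return one boolean per feature vector: True means 'speech present'."""
    scores = rbf_activations(X, centers, width) @ weights
    return scores > threshold
```

Calling `train_rbf` on the output of `make_toy_data` and then passing the same features to `detect_speech` yields one boolean decision per sample. Real use would replace the toy data with features extracted from short audio frames, and the centers would more typically be chosen by clustering than by random sampling.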

Keywords

speech recognition; binary decision; human speech detection; mel-cepstra; quasi-periodic signal; radial basis function networks; spectral power density; statistical pattern classifier; structured noise; wavelet feature vectors; Acoustic noise; Acoustic signal detection; Humans; Multiple signal classification; Music; Power system reliability; Signal to noise ratio; Speech enhancement; Speech recognition; Working environment noise;

fLanguage

English

Publisher

IEEE

Conference_Titel

Pattern Recognition, 1994. Vol. 2 - Conference B: Computer Vision & Image Processing, Proceedings of the 12th IAPR International Conference on

Conference_Location

Jerusalem

Print_ISBN

0-8186-6270-0

Type

conf

DOI

10.1109/ICPR.1994.576930

Filename

576930