Title :
Adaptive pitch-based speech detection for hands-free applications
Author :
Abu-El-Quran, A.R. ; Goubran, R.A.
Author_Institution :
Dept. of Syst. & Comput. Eng., Carleton Univ., Ottawa, Ont., Canada
Abstract :
This paper proposes a new algorithm for classifying an audio segment as speech or nonspeech. The proposed algorithm can handle reverberation and low signal-to-noise environments, which makes it suitable for hands-free applications. The algorithm divides an audio segment into frames, estimates the presence of pitch in each frame, and calculates a pitch ratio parameter. This parameter is then used to classify the audio segment. The threshold used in calculating this parameter is adapted to accommodate different environments. The performance of the proposed algorithm is evaluated for different signal-to-noise ratios and different segment sizes using a library of audio segments. The library includes speech segments and nonspeech segments such as fan noise and cocktail noise. Using 0.4-second segments, the proposed algorithm is shown to reach a correct decision for 95.7% of the speech segments and 96.7% of the nonspeech segments under reverberant conditions.
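The frame-wise procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the autocorrelation-based pitch test, the function names, the 8 kHz sample rate, the 256-sample frames, the 60-400 Hz pitch range, and both fixed thresholds are assumptions made for the example. The threshold adaptation mentioned in the abstract is not reproduced, because the abstract does not say how it is done.

```python
def frame_has_pitch(frame, sample_rate, threshold, f_min=60.0, f_max=400.0):
    """Assumed pitch test: True if the normalized autocorrelation of the
    frame has a peak above `threshold` within the lag range that
    corresponds to pitch frequencies between f_min and f_max."""
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return False                       # silent frame: no pitch
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        best = max(best, r)
    return best > threshold


def pitch_ratio(segment, sample_rate, frame_len, threshold):
    """Fraction of non-overlapping frames in which pitch is detected."""
    frames = [segment[i:i + frame_len]
              for i in range(0, len(segment) - frame_len + 1, frame_len)]
    if not frames:
        return 0.0
    pitched = sum(frame_has_pitch(f, sample_rate, threshold) for f in frames)
    return pitched / len(frames)


def classify_segment(segment, sample_rate=8000, frame_len=256,
                     threshold=0.5, min_ratio=0.5):
    """Label a segment from its pitch ratio. All defaults are illustrative
    assumptions; the paper adapts its threshold to the environment."""
    ratio = pitch_ratio(segment, sample_rate, frame_len, threshold)
    return "speech" if ratio >= min_ratio else "nonspeech"
```

In this sketch, a 0.4-second segment at 8 kHz (3200 samples) yields 12 full frames. A strongly periodic signal produces a pitch ratio near 1, while broadband noise produces a ratio near 0. Real reverberant speech would fall in between, which is where an adapted threshold would matter.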
Keywords :
adaptive signal processing; reverberation; signal classification; speech processing; 0.4 s; adaptive pitch ratio algorithm; adaptive pitch-based speech detection; audio segment classification; cocktail noise; fan noise; hands-free applications; low signal-to-noise environments; nonspeech segments; pitch ratio parameter; reverberant conditions; speech segments; Acoustic noise; Application software; Cameras; Libraries; Microphone arrays; Noise cancellation; Reverberation; Signal to noise ratio; Speech enhancement; Working environment noise
Conference_Title :
Acoustics, Speech, and Signal Processing, 2005. Proceedings. (ICASSP '05). IEEE International Conference on
Print_ISBN :
0-7803-8874-7
DOI :
10.1109/ICASSP.2005.1415707