DocumentCode :
1919557
Title :
An entropy based robust speech boundary detection algorithm for realistic noisy environments
Author :
Weaver, Kim ; Waheed, Khurram ; Salem, Fathi M.
Author_Institution :
Circuits, Syst. & Neural Network Lab., Michigan State Univ., East Lansing, MI, USA
Volume :
1
fYear :
2003
fDate :
20-24 July 2003
Firstpage :
680
Abstract :
This paper addresses the issue of automatic word/sentence boundary detection in both noiseless and noisy backgrounds. We present a speech boundary detection algorithm that uses a time-domain entropic contrast function. The entropic contrast is better behaved than energy-based measures, which makes the algorithm immune to the endpoint cut-off issues that affect energy-based methods. The algorithm can estimate speech boundaries both in noiseless conditions and in distinctive noise backgrounds such as a fan, a car engine, or a radio. For wide-spectrum colored background noise such as jazz, opera, songs, or rock music, we further propose a modification to the preprocessing stage that incorporates a frequency-weighting scheme to emphasize the speech content. This improved scheme provides proper speech segmentation even in the presence of wide-spectrum background noise, with no change in computational cost relative to our earlier proposed algorithm. A complete time-domain implementation is sought because of its lower computational burden and its suitability for real-time implementation on DSPs, FPGAs, ASICs, etc. The algorithm improves the accuracy of word boundary estimates by at least 25% for isolated speech (and 16% for connected speech). For continuous speech, the algorithm can determine sentence boundaries, allowing power-efficient implementation of speech recognition engines by rejecting extended periods of silence.
Keywords :
entropy; noise; speech recognition; ASIC; DSP; FPGA; continuous speech; energy-based methods; frequency-weighting scheme; power efficient implementation; real-time implementation; realistic noisy environment; robust speech boundary detection algorithm; sentence boundaries determination; speech content; time-domain entropic contrast function; wide-spectrum colored background noise; word boundary estimation; Background noise; Computational efficiency; Detection algorithms; Engines; Entropy; Frequency; Noise robustness; Speech enhancement; Time domain analysis; Working environment noise
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks, 2003. Proceedings of the International Joint Conference on
ISSN :
1098-7576
Print_ISBN :
0-7803-7898-9
Type :
conf
DOI :
10.1109/IJCNN.2003.1223446
Filename :
1223446