Title :
Fusion based speech segmentation in DARPA SPINE2 task
Author :
Zheng, Chengyi ; Yan, Yonghong
Author_Institution :
Comput. Sci. & Eng. Dept., Oregon Health & Sci. Univ., Beaverton, OR, USA
Abstract :
We report a new fusion-based segmentation approach that uses multiple filter bank coefficients. The approach reuses the existing feature extraction procedure and so adds little computational cost. A second level of fusion combines several segmentation systems. Evaluation was conducted on the second Speech In Noisy Environments (SPINE2) task. Experiments show that our fusion-based approaches significantly reduce the word error rate (WER) compared with two classifier-based approaches. Compared with manual segmentation, our approach increases WER by only 0.3%.
Keywords :
channel bank filters; error statistics; feature extraction; sensor fusion; speech recognition; ASR; DARPA SPINE2 task; Speech In Noisy Environments task; WER reduction; automatic speech recognition; fusion based segmentation; multiple filter bank coefficients; speech segmentation; Acoustic noise; Automatic speech recognition; Computer science; Decoding; Hidden Markov models; Loudspeakers; Military aircraft; Speech recognition; Streaming media; Working environment noise
Conference_Title :
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
Print_ISBN :
0-7803-8484-9
DOI :
10.1109/ICASSP.2004.1326128