DocumentCode
417276
Title
Fusion based speech segmentation in DARPA SPINE2 task
Author
Zheng, Chengyi ; Yan, Yonghong
Author_Institution
Comput. Sci. & Eng. Dept., Oregon Health & Sci. Univ., Beaverton, OR, USA
Volume
1
fYear
2004
fDate
17-21 May 2004
Abstract
We report a new fusion-based segmentation approach using multiple filter bank coefficients. This approach takes advantage of the current feature extraction procedure and adds little computational cost. Another level of fusion was performed by combining several segmentation systems. Evaluation was conducted on the second Speech In Noisy Environments (SPINE2) task. Experiments show that our fusion-based approaches significantly reduced the word error rate (WER) compared to two classifier-based approaches. Compared to manual segmentation, our approach increases WER by only 0.3%.
Keywords
channel bank filters; error statistics; feature extraction; sensor fusion; speech recognition; ASR; DARPA SPINE2 task; Speech In Noisy Environments task; WER reduction; automatic speech recognition; feature extraction; fusion based segmentation; multiple filter bank coefficients; speech segmentation; Acoustic noise; Automatic speech recognition; Computer science; Decoding; Hidden Markov models; Loudspeakers; Military aircraft; Speech recognition; Streaming media; Working environment noise;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
ISSN
1520-6149
Print_ISBN
0-7803-8484-9
Type
conf
DOI
10.1109/ICASSP.2004.1326128
Filename
1326128