Title :
Evaluating VAD for automatic speech recognition
Author :
Sibo Tong ; Nanxin Chen ; Yanmin Qian ; Kai Yu
Author_Institution :
Dept. of Comput. Sci. & Eng., Shanghai Jiao Tong Univ., Shanghai, China
Abstract :
Voice activity detection (VAD) plays a crucial role in speech processing, especially in automatic speech recognition (ASR). It identifies the boundaries of the speech to be recognized, and the accuracy of these boundaries may significantly affect recognition performance. Conventional VAD evaluation criteria are mostly based on the frame-level accuracy of speech/non-speech classification, which may result in a weak correlation between VAD and ASR performance. Although some VAD evaluation criteria consider boundary effects, there has not been an effective overall criterion suitable for evaluating the effect of VAD on ASR. This paper proposes an integrated VAD evaluation criterion that takes various boundary effects into account. Experiments on an English Switchboard task showed that the conventional frame-accuracy-based VAD criterion has a weak and unstable correlation with word error rate, while the proposed overall criterion is much more stably correlated with word error rate.
Keywords :
signal classification; speech recognition; ASR; English Switchboard task; VAD; automatic speech recognition; boundary effects; speech-nonspeech classification; voice activity detection; Accuracy; Correlation; Error analysis; Measurement; Speech; evaluation metric
Conference_Title :
Signal Processing (ICSP), 2014 12th International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-2188-1
DOI :
10.1109/ICOSP.2014.7015406