DocumentCode :
3131937
Title :
Recognition rate estimation based on word alignment network and discriminative error type classification
Author :
Ogawa, Anna ; Hori, Toshikazu ; Nakamura, A.
Author_Institution :
NTT Commun. Sci. Labs., NTT Corp., Kyoto, Japan
fYear :
2012
fDate :
2-5 Dec. 2012
Firstpage :
113
Lastpage :
118
Abstract :
Techniques for estimating recognition rates without using reference transcriptions are essential for judging whether speech recognition technology is applicable to a new task. This paper proposes two recognition rate estimation methods for continuous speech recognition. The first is an easy-to-use method based on a word alignment network (WAN), which is obtained from a word confusion network through simple conversion procedures. A WAN contains the correct (C), substitution error (S), insertion error (I), and deletion error (D) probabilities, word by word, for a recognition result. By summing each of these CSID probabilities over the recognition result, the percent correct and word accuracy (WACC) can be estimated without using a reference transcription. The second, more advanced method refines the CSID probabilities provided by a WAN using discriminative error type classification (ETC) and estimates the recognition rates more accurately. In experiments on the MIT lecture speech corpus, we obtained a correlation coefficient of 0.97 between the true WACCs, calculated by a scoring tool using reference transcriptions, and the WACCs estimated from the discriminative ETC results.
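The summation step described in the abstract can be illustrated with a minimal sketch. It assumes the standard scoring definitions, where the number of reference words is N = C + S + D, percent correct is C / N, and WACC is (C - I) / N, and it substitutes the summed probabilities for the counts. The `Segment` container, the `estimate_rates` function, and the sample probabilities are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class Segment:
    """Hypothetical container for the CSID probabilities of one WAN segment."""
    c: float  # probability that the word is correct
    s: float  # probability of a substitution error
    i: float  # probability of an insertion error
    d: float  # probability of a deletion error


def estimate_rates(segments: Iterable[Segment]) -> Tuple[float, float]:
    """Estimate percent correct and WACC (both in %) from summed CSID probabilities.

    Assumes the standard definitions with expected counts in place of
    reference-based counts: N = C + S + D, %Correct = C / N, WACC = (C - I) / N.
    """
    segments = list(segments)
    exp_c = sum(seg.c for seg in segments)
    exp_s = sum(seg.s for seg in segments)
    exp_i = sum(seg.i for seg in segments)
    exp_d = sum(seg.d for seg in segments)

    # Expected number of reference words; insertions do not count toward it.
    n_ref = exp_c + exp_s + exp_d
    if n_ref <= 0:
        raise ValueError("expected number of reference words must be positive")

    percent_correct = 100.0 * exp_c / n_ref
    wacc = 100.0 * (exp_c - exp_i) / n_ref
    return percent_correct, wacc


if __name__ == "__main__":
    # Made-up probabilities for three segments, for illustration only.
    wan = [
        Segment(c=0.9, s=0.1, i=0.0, d=0.0),
        Segment(c=0.6, s=0.3, i=0.0, d=0.1),
        Segment(c=0.0, s=0.0, i=0.8, d=0.2),
    ]
    pc, wacc = estimate_rates(wan)
    print(f"percent correct: {pc:.2f}  WACC: {wacc:.2f}")
```

Because insertions are subtracted in the numerator but excluded from N, the estimated WACC can fall below zero for recognition results with many likely insertions, as with WACC computed against a reference transcription.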
Keywords :
error statistics; probability; speech recognition; word processing; CSID probabilities; ETC; MIT lecture speech corpus; WACC; WAN; continuous speech recognition technology; conversion procedures; correct and word accuracy; correlation coefficient; deletion error; discriminative error type classification; easy-to-use method; insertion error; recognition rate estimation methods; recognition rate estimation technique; reference transcriptions; scoring tool; substitution error; word alignment network; word confusion network; word-by-word probabilities; Error probability; Estimation; Feature extraction; Speech; Speech recognition; Training; Wide area networks; Speech recognition; discriminative error type classification; recognition rate estimation; word alignment network;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Spoken Language Technology Workshop (SLT), 2012 IEEE
Conference_Location :
Miami, FL
Print_ISBN :
978-1-4673-5125-6
Electronic_ISBN :
978-1-4673-5124-9
Type :
conf
DOI :
10.1109/SLT.2012.6424207
Filename :
6424207