DocumentCode
730806
Title
Automatic pronunciation verification for speech recognition
Author
Rao, Kanishka; Peng, Fuchun; Beaufays, Francoise
Author_Institution
Google Inc., Mountain View, CA, USA
fYear
2015
fDate
19-24 April 2015
Firstpage
5162
Lastpage
5166
Abstract
Pronunciations for words are a critical component of an automated speech recognition (ASR) system, as misrecognitions may be caused by missing or inaccurate pronunciations. The need for high-quality pronunciations has recently motivated data-driven techniques to generate them [1]. We propose a data-driven and language-independent framework for verifying such pronunciations to further improve lexicon quality in ASR. New candidate pronunciations are verified by re-recognizing historical audio logs and examining the associated recognition costs. We build an additional pronunciation quality feature from word and pronunciation frequencies in the logs. A machine-learned classifier trained on these features achieves nearly 90% accuracy in labeling good vs. bad pronunciations across all languages we tested. New pronunciations verified as good may be added to a dictionary, while bad pronunciations may be discarded or sent to experts for further evaluation. We simultaneously verify 5,000 to 30,000 new pronunciations within a few hours and show improvements in ASR performance as a result of including pronunciations verified by this system.
Keywords
feature extraction; speech recognition; ASR; automated speech recognition system; automatic pronunciation verification; data-driven techniques; lexicon quality; machine learned classifier; misrecognitions; pronunciation quality feature; recognizing historical audio logs; Dictionaries; Measurement; Speech recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
Conference_Location
South Brisbane, QLD
Type
conf
DOI
10.1109/ICASSP.2015.7178955
Filename
7178955