Fix it where it fails: Pronunciation learning by mining error corrections from speech logs

Author

Kou, Zhenzhen; Stanton, Daisy; Peng, Fuchun; Beaufays, Francoise; Strohman, Trevor

Author_Institution

Google Inc., Mountain View, CA, USA

fYear

2015

fDate

19-24 April 2015

Firstpage

4619

Lastpage

4623

Abstract

The pronunciation dictionary, or lexicon, is an essential component in an automatic speech recognition (ASR) system in that incorrect pronunciations cause systematic misrecognitions. It typically consists of a list of word-pronunciation pairs written by linguists, and a grapheme-to-phoneme (G2P) engine to generate pronunciations for words not in the list. The hand-generated list can never keep pace with the growing vocabulary of a live speech recognition system, and the G2P is usually of limited accuracy. This is especially true for proper names whose pronunciations may be influenced by various historical or foreign-origin factors. In this paper, we propose a language-independent approach to detect misrecognitions and their corrections from voice search logs. We learn previously unknown pronunciations from this data, and demonstrate that they significantly improve the quality of a production-quality speech recognition system.

Keywords

linguistics; speech recognition; ASR system; G2P engine; automatic speech recognition; foreign-origin factors; grapheme-phoneme engine; hand-generated list; language-independent approach; lexicon; limited accuracy; linguists; live speech recognition system; mining error correction; production-quality speech recognition system; pronunciation dictionary; pronunciation learning; speech logs; systematic misrecognitions; voice search logs; word-pronunciation pairs; Acoustics; Data mining; Engines; Keyboards; Motion pictures; Speech; Speech recognition; data extraction; logistic regression

fLanguage

English

Publisher

IEEE

Conference_Title

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178846

Filename

7178846