  • DocumentCode
    730722
  • Title
    Fix it where it fails: Pronunciation learning by mining error corrections from speech logs
  • Author
    Kou, Zhenzhen; Stanton, Daisy; Peng, Fuchun; Beaufays, Francoise; Strohman, Trevor
  • Author_Institution
    Google Inc., Mountain View, CA, USA
  • fYear
    2015
  • fDate
    19-24 April 2015
  • Firstpage
    4619
  • Lastpage
    4623
  • Abstract
    The pronunciation dictionary, or lexicon, is an essential component in an automatic speech recognition (ASR) system in that incorrect pronunciations cause systematic misrecognitions. It typically consists of a list of word-pronunciation pairs written by linguists, and a grapheme-to-phoneme (G2P) engine to generate pronunciations for words not in the list. The hand-generated list can never keep pace with the growing vocabulary of a live speech recognition system, and the G2P is usually of limited accuracy. This is especially true for proper names whose pronunciations may be influenced by various historical or foreign-origin factors. In this paper, we propose a language-independent approach to detect misrecognitions and their corrections from voice search logs. We learn previously unknown pronunciations from this data, and demonstrate that they significantly improve the quality of a production-quality speech recognition system.
  • Keywords
    linguistics; speech recognition; ASR system; G2P engine; automatic speech recognition; foreign-origin factors; grapheme-phoneme engine; hand-generated list; language-independent approach; lexicon; limited accuracy; linguists; live speech recognition system; mining error correction; production-quality speech recognition system; pronunciation dictionary; pronunciation learning; speech logs; systematic misrecognitions; voice search logs; word-pronunciation pairs; Acoustics; Data mining; Engines; Keyboards; Motion pictures; Speech; Speech recognition; data extraction; logistic regression
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on
  • Conference_Location
    South Brisbane, QLD
  • Type
    conf
  • DOI
    10.1109/ICASSP.2015.7178846
  • Filename
    7178846