  • DocumentCode
    60759
  • Title
    Learning Lexicons From Speech Using a Pronunciation Mixture Model
  • Author
    McGraw, Ian ; Badr, Ibrahim ; Glass, James R.
  • Author_Institution
    Electr. Eng. & Comput. Sci. Dept., Massachusetts Inst. of Technol., Cambridge, MA, USA
  • Volume
    21
  • Issue
    2
  • fYear
    2013
  • fDate
    Feb. 2013
  • Firstpage
    357
  • Lastpage
    366
  • Abstract
    In many ways, the lexicon remains the Achilles heel of modern automatic speech recognizers. Unlike stochastic acoustic and language models that learn the values of their parameters from training data, the baseform pronunciations of words in a recognizer's lexicon are typically specified manually, and do not change, unless they are edited by an expert. Our work presents a novel generative framework that uses speech data to learn stochastic lexicons, thereby taking a step towards alleviating the need for manual intervention and automatically learning high-quality pronunciations for words. We test our model on continuous speech in a weather information domain. In our experiments, we see significant improvements over a manually specified “expert-pronunciation” lexicon. We then analyze variations of the parameter settings used to achieve these gains.
  • Keywords
    learning (artificial intelligence); speech recognition; stochastic processes; high-quality pronunciations; language models; manual intervention; manually specified expert-pronunciation lexicon; modern automatic speech recognizers; parameter settings; pronunciation mixture model; recognizer lexicon; speech data; stochastic acoustic; stochastic lexicon learning; training data; weather information domain; Acoustics; Mathematical model; Speech; Speech processing; Speech recognition; Stochastic processes; Training; Baseform generation; dictionary training with acoustics via EM; pronunciation learning; stochastic lexicon;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2012.2226158
  • Filename
    6338277