  • DocumentCode
    672383
  • Title
    Acoustic unit discovery and pronunciation generation from a grapheme-based lexicon
  • Author
    Hartmann, W. ; Roy, Anirban ; Lamel, Lori ; Gauvain, Jean-Luc
  • Author_Institution
    Spoken Language Processing Group, LIMSI, Orsay, France
  • fYear
    2013
  • fDate
    8-12 Dec. 2013
  • Firstpage
    380
  • Lastpage
    385
  • Abstract
    We present a framework for discovering acoustic units and generating an associated pronunciation lexicon from an initial grapheme-based recognition system. Our approach consists of two distinct contributions. First, context-dependent grapheme models are clustered using a spectral clustering approach to create a set of phone-like acoustic units. Next, we transform the pronunciation lexicon using a statistical machine translation-based approach. Pronunciation hypotheses generated from a decoding of the training set are used to create a phrase-based translation table. We propose a novel method for scoring the phrase-based rules that significantly improves the output of the transformation process. Results on an English language dataset demonstrate that the combined methods provide a 13% relative reduction in word error rate compared to a baseline grapheme-based system. Our approach could potentially be applied to low-resource languages without existing lexicons, such as those in the Babel project.
  • Keywords
    acoustic signal processing; language translation; natural language processing; pattern clustering; spectral analysis; speech recognition; statistical analysis; Babel project; English language dataset; context-dependent grapheme models; grapheme-based lexicon; grapheme-based speech recognition system; low-resource languages; phone-like acoustic unit discovery; phrase-based rule scoring; phrase-based translation table; pronunciation hypothesis generation; pronunciation lexicon generation; relative reduction; spectral clustering approach; statistical machine translation-based approach; training set decoding; transformation process output improvement; word error rate; Acoustics; Computational modeling; Context modeling; Dictionaries; Hidden Markov models; Training; Training data; acoustic unit discovery; automatic speech recognition; grapheme-based speech recognition; pronunciation learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on
  • Conference_Location
    Olomouc
  • Type
    conf
  • DOI
    10.1109/ASRU.2013.6707760
  • Filename
    6707760