• DocumentCode
    3165629
  • Title
    Grapheme-to-phoneme model generation for Indo-European languages
  • Author
    Schlippe, Tim ; Ochs, Sebastian ; Schultz, Tanja
  • Author_Institution
    Cognitive Syst. Lab., Karlsruhe Inst. of Technol. (KIT), Karlsruhe, Germany
  • fYear
    2012
  • fDate
    25-30 March 2012
  • Firstpage
    4801
  • Lastpage
    4804
  • Abstract
    In this paper, we evaluate grapheme-to-phoneme (g2p) models across languages and at different quality levels. We created g2p models for Indo-European languages with word-pronunciation pairs from the GlobalPhone project and from Wiktionary [1]. We then checked their quality in terms of consistency and complexity, as well as their impact on Czech, English, French, Spanish, Polish, and German ASR. While the GlobalPhone dictionaries were manually cross-checked and have been used successfully in LVCSR, Wiktionary pronunciations have been provided by the Internet community and can be used to rapidly and economically create pronunciation dictionaries for new languages and domains.
  • Keywords
    Internet; computational complexity; dictionaries; natural language processing; speech recognition; Czech ASR; English ASR; French ASR; German ASR; GlobalPhone dictionaries; GlobalPhone project; Indo-European languages; Internet community; LVCSR; Polish ASR; Spanish ASR; Wiktionary pronunciations; automatic speech recognition; complexity check; consistency check; g2p model; grapheme-to-phoneme model generation; multilingual speech recognition; word-pronunciation pairs; Complexity theory; Computational modeling; Data models; Dictionaries; Error analysis; Training; Training data; multilingual speech recognition; pronunciation modeling; web-derived pronunciations;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
  • Conference_Location
    Kyoto
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4673-0045-2
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2012.6288993
  • Filename
    6288993