• DocumentCode
    2262297
  • Title
    Multi-lingual phoneme recognition exploiting acoustic-phonetic similarities of sounds
  • Author
    Köhler, Joachim
  • Author_Institution
    Siemens AG, Munich, Germany
  • Volume
    4
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    2195
  • Abstract
    The aim of the work is to exploit the acoustic-phonetic similarities between several languages. In recent work, cross-language HMM-based phoneme models have been used only for bootstrapping the language-dependent models, and the multi-lingual approach has been investigated only on very small speech corpora. The author introduces a statistical distance measure to determine the similarities of sounds. Further, he presents a new technique to model multi-lingual phonemes. The experiments are conducted with the OGI Multi-Language Telephone Speech Corpus for the languages American English, German and Spanish. In the first experiment, phoneme recognition rates between 39.0% and 53.9% are achieved using language-dependent models. Using cross-language models yields improvement for some phonemes, but on average a degradation of recognition performance is observed. However, cross-language models speed up the cross-language transfer and reduce the size of the phoneme inventory of multi-lingual speech recognition systems. Finally, a new method of modelling multi-lingual phonemes, which can be used for a variety of languages, is presented. This technique reduces the number of phoneme-based units in a multi-lingual speech recognition system.
  • Keywords
    natural languages; speech recognition; statistical analysis; American English language; German language; OGI Multi-Language Telephone Speech Corpus; Spanish language; acoustic-phonetic sound similarity; cross-language transfer; language-dependent models; multi-lingual phoneme recognition; multi-lingual speech recognition systems; phoneme inventory; recognition performance; statistical distance measure; Acoustic measurements; Costs; Databases; Degradation; Hidden Markov models; Loudspeakers; Natural languages; Robustness; Speech recognition; Telephony;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type
    conf
  • DOI
    10.1109/ICSLP.1996.607240
  • Filename
    607240