• DocumentCode
    290069
  • Title

    Automatic language identification using sub-word models

  • Author

    Tucker, R.C.F. ; Carey, M.J. ; Parris, E.S.

  • Author_Institution
    Ensigma Ltd., Chepstow, UK
  • Volume
    i
  • fYear
    1994
  • fDate
    19-22 Apr 1994
  • Abstract
    The paper describes initial experiments on automatic language identification with the particular aim of discriminating languages in the same language group. Subword models were built from the English, Dutch and Norwegian sections of the EUROM1 database using fully automatic segmentation based on TIMIT-derived models. Three techniques were then examined. In the first technique only acoustic differences between the phonemes of each language were used. The second technique relied on the relative frequencies of the phonemes of each language, while the third technique combined the two sources of information. The third technique proved the best, giving 97% accuracy for English vs. Dutch, and 90% across the three languages.
  • Keywords
    hidden Markov models; identification; natural languages; speech recognition; Dutch; EUROM1 database; English; Norwegian; TIMIT-derived models; acoustic differences; automatic language identification; automatic segmentation; phonemes; relative frequencies; subword models; Databases; Frequency; Hidden Markov models; Loudspeakers; Natural languages; Speech recognition; Testing; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
  • Conference_Location
    Adelaide, SA
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-1775-0
  • Type

    conf

  • DOI
    10.1109/ICASSP.1994.389295
  • Filename
    389295