DocumentCode :
290069
Title :
Automatic language identification using sub-word models
Author :
Tucker, R.C.F. ; Carey, M.J. ; Parris, E.S.
Author_Institution :
Ensigma Ltd., Chepstow, UK
Volume :
i
fYear :
1994
fDate :
19-22 Apr 1994
Abstract :
The paper describes initial experiments on automatic language identification, with the particular aim of discriminating between languages in the same language group. Subword models were built from the English, Dutch and Norwegian sections of the EUROM1 database using fully automatic segmentation based on TIMIT-derived models. Three techniques were then examined. The first used only the acoustic differences between the phonemes of each language. The second relied on the relative frequencies of the phonemes of each language, while the third combined the two sources of information. The third technique proved the best, giving 97% accuracy for English vs. Dutch and 90% across the three languages.
Keywords :
hidden Markov models; identification; natural languages; speech recognition; Dutch; EUROM1 database; English; Norwegian; TIMIT-derived models; acoustic differences; automatic language identification; automatic segmentation; phonemes; relative frequencies; subword models; Databases; Frequency; Hidden Markov models; Loudspeakers; Natural languages; Speech recognition; Testing; Training data
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location :
Adelaide, SA
ISSN :
1520-6149
Print_ISBN :
0-7803-1775-0
Type :
conf
DOI :
10.1109/ICASSP.1994.389295
Filename :
389295