DocumentCode
290069
Title
Automatic language identification using sub-word models
Author
Tucker, R.C.F. ; Carey, M.J. ; Parris, E.S.
Author_Institution
Ensigma Ltd., Chepstow, UK
Volume
i
fYear
1994
fDate
19-22 Apr 1994
Abstract
The paper describes initial experiments on automatic language identification, with the particular aim of discriminating between languages in the same language group. Subword models were built from the English, Dutch and Norwegian sections of the EUROM1 database using fully automatic segmentation based on TIMIT-derived models. Three techniques were then examined. The first technique used only the acoustic differences between the phonemes of each language. The second technique relied on the relative frequencies of the phonemes of each language, while the third technique combined the two sources of information. The third technique proved the best, giving 97% accuracy for English vs. Dutch and 90% across the three languages.
Keywords
hidden Markov models; identification; natural languages; speech recognition; Dutch; EUROM1 database; English; Norwegian; TIMIT-derived models; acoustic differences; automatic language identification; automatic segmentation; phonemes; relative frequencies; subword models; Databases; Frequency; Hidden Markov models; Loudspeakers; Natural languages; Speech recognition; Testing; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 1994. ICASSP-94., 1994 IEEE International Conference on
Conference_Location
Adelaide, SA
ISSN
1520-6149
Print_ISBN
0-7803-1775-0
Type
conf
DOI
10.1109/ICASSP.1994.389295
Filename
389295