DocumentCode :
2022245
Title :
Cross-lingual experiments with phone recognition
Author :
Lamel, Lori F. ; Gauvain, Jean-Luc
Author_Institution :
LIMSI-CNRS, Orsay, France
Volume :
2
fYear :
1993
fDate :
27-30 April 1993
Firstpage :
507
Abstract :
Research on speaker-independent continuous phone recognition for both French and English is presented. The phone accuracy is assessed on the BREF corpus for French, and on the Wall Street Journal (WSJ) and TIMIT corpora for English. Cross-language differences concerning language properties are presented. It is found that French is easier to recognize at the phone level (the phone error for BREF is 23.6% vs. 30.1% for WSJ), but harder to recognize at the lexical level due to the larger number of homophones. Experiments with signal analysis indicate that a 4 kHz signal bandwidth is sufficient for French, whereas 8 kHz is needed for English. Phone recognition is a powerful technique for language, sex, and speaker identification. With 2 s of speech, the language can be identified with better than 99% accuracy. Sex identification for BREF and WSJ is error-free. Speaker identification accuracies of 98.2% on TIMIT (462 speakers) and 99.1% on BREF (57 speakers) were obtained with one utterance per speaker. 100% accuracies were obtained with two utterances per speaker.
Keywords :
natural languages; speech recognition; 4 kHz; 8 kHz; BREF corpus; English; French; TIMIT; Wall Street Journal; accuracy; cross-lingual experiments; language identification; language properties; sex identification; signal analysis; speaker identification; speaker-independent continuous phone recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.1993.319353
Filename :
319353