Title :
Towards a large-vocabulary French vocal dictation based on a size-independent language-model search using the INRS recognizer
Author :
Tolba, Hesham ; O'Shaughnessy, Douglas
Author_Institution :
INRS-Telecommun., Verdun, Que., Canada
Abstract :
Reports progress on the large-vocabulary French vocal dictation studies at INRS-Télécom. To evaluate this progress, the INRS hidden Markov model (HMM) based recognizer is used. This recognizer, which represents each phone using HMMs, uses context-dependent phone modeling and n-gram statistics to cope with coarticulation and phonological phenomena, respectively. A series of experiments on speaker-independent continuous-speech recognition has been carried out using a subset of the large read-speech French-language corpus, BREF, containing recordings of texts selected from the French newspaper Le Monde. We show through experiments that using a lexical graph that ignores the language model states and homophone distinctions, and postponing the application of such knowledge to a post-processor, simplifies the recognition process while maintaining its high accuracy. The word recognition rate, using gender-dependent vector quantization (VQ) models, a 20,000-word pronunciation variants-based lexicon and a bigram model estimated using Le Monde text data, was found to be 91.62% for males and 90.98% for females.
Keywords :
dictation; gender issues; hidden Markov models; natural languages; nomograms; search problems; speech recognition; vector quantisation; vocabulary; BREF corpus; INRS speech recognizer; INRS-Telecom; Le Monde newspaper; accuracy; bigram model estimation; coarticulation; context-dependent phone modeling; females; gender-dependent vector quantization models; hidden Markov model; homophone distinctions; language model states; large-vocabulary French vocal dictation; lexical graph; males; n-gram statistics; phone representation; phonological phenomena; post-processor; pronunciation variants-based lexicon; read-speech French-language corpus; size-independent language-model search; speaker-independent continuous-speech recognition; text data; text recordings; word recognition rate; Automatic speech recognition; Business; Context modeling; Hidden Markov models; Natural languages; Speech enhancement; Speech recognition; Statistics; Text recognition; Vocabulary;
Conference_Title :
Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
Conference_Location :
Istanbul
Print_ISBN :
0-7803-6293-4
DOI :
10.1109/ICASSP.2000.862066