Regional Information Center for Science and Technology - Speech recognition with very large size dictionary

Abstract:

This paper proposes a new strategy, Multi-Level Decoding (MLD), that makes it possible to use a Very Large Size Dictionary (VLSD, more than 100,000 words) in speech recognition. MLD proceeds in three steps:

• A Syllable Match procedure uses an acoustic model to build a list of the most probable syllables that match the acoustic signal starting at a given time frame.
• From this list, a Word Match procedure uses the dictionary to build partial word hypotheses.
• A Sentence Match procedure then uses a probabilistic language model to build partial sentence hypotheses until complete sentences are found.

An original matching algorithm is proposed for the Syllable Match procedure. The strategy is evaluated on a French text dictation task with two dictionaries:

• one composed of the 10,000 most frequent words;
• the other composed of 200,000 words.

Recognition results are reported and compared. With the 10,000-word dictionary, the word error rate is 17.3%; when errors due to lack of dictionary coverage are not counted, it falls to 10.6%. With the 200,000-word dictionary, the word error rate is 12.7%.
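The three MLD steps named in the abstract (Syllable Match, Word Match, Sentence Match) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe their matching algorithm, so the Syllable Match here is replaced by a hard-coded toy lattice, and the Sentence Match is assumed to be a simple beam search over a bigram model. All names, scores, and data (`SYLLABLE_LATTICE`, `DICTIONARY`, `BIGRAM`, `UNSEEN`, the frame indices) are hypothetical and chosen only to make the example runnable.

```python
# Hypothetical toy data: for each start frame, syllable candidates given as
# (syllable, end_frame, acoustic log-score). A real Syllable Match would
# compute these from an acoustic model; here they are hard-coded.
SYLLABLE_LATTICE = {
    0: [("bon", 3, -1.0), ("pon", 3, -2.5)],
    3: [("jour", 6, -1.2), ("sur", 6, -2.0)],
}

# Hypothetical pronunciation dictionary: word -> syllable sequence.
DICTIONARY = {
    "bonjour": ("bon", "jour"),
    "bon": ("bon",),
    "jour": ("jour",),
    "sur": ("sur",),
}

# Hypothetical bigram log-probabilities; unseen pairs get a flat penalty.
BIGRAM = {("<s>", "bonjour"): -0.5, ("<s>", "bon"): -1.0, ("bon", "jour"): -1.5}
UNSEEN = -5.0


def syllable_match(frame):
    """Level 1 (stand-in): syllable candidates starting at `frame`."""
    return SYLLABLE_LATTICE.get(frame, [])


def word_match(frame):
    """Level 2: chain syllables into dictionary words starting at `frame`.

    Returns a list of (word, end_frame, acoustic log-score).
    """
    results = []
    stack = [((), frame, 0.0)]  # (syllables so far, current frame, score)
    while stack:
        syllables, t, score = stack.pop()
        for syl, end, syl_score in syllable_match(t):
            seq = syllables + (syl,)
            # Prune sequences that are not a prefix of any dictionary entry.
            if not any(p[:len(seq)] == seq for p in DICTIONARY.values()):
                continue
            for word, pron in DICTIONARY.items():
                if pron == seq:
                    results.append((word, end, score + syl_score))
            stack.append((seq, end, score + syl_score))
    return results


def sentence_match(last_frame, beam=10):
    """Level 3: extend partial sentences until they reach `last_frame`.

    Returns complete hypotheses as (log-score, end_frame, words), best first.
    """
    beams = [(0.0, 0, ("<s>",))]
    complete = []
    while beams:
        extended = []
        for score, t, words in beams:
            for word, end, acoustic in word_match(t):
                lm = BIGRAM.get((words[-1], word), UNSEEN)
                hyp = (score + acoustic + lm, end, words + (word,))
                (complete if end == last_frame else extended).append(hyp)
        # Keep only the best partial sentences for the next round.
        beams = sorted(extended, reverse=True)[:beam]
    return sorted(complete, reverse=True)


if __name__ == "__main__":
    for score, _, words in sentence_match(last_frame=6):
        print(round(score, 2), " ".join(words[1:]))
```

With this toy data, the single word "bonjour" ranks above the two-word sequence "bon jour", because both the acoustic and the language model scores are summed along each path. The prefix check in `word_match` is a linear scan for brevity; with a dictionary of 100,000 words or more, a prefix tree over syllable sequences would be a more plausible choice, though the abstract does not say what structure the authors use.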