DocumentCode :
312003
Title :
A category based approach for recognition of out-of-vocabulary words
Author :
Gallwitz, F. ; Nöth, E. ; Niemann, H.
Author_Institution :
Lehrstuhl für Mustererkennung, Erlangen-Nürnberg Univ., Germany
Volume :
1
fYear :
1996
fDate :
3-6 Oct 1996
Firstpage :
228
Abstract :
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary is very large. We present a new approach for integrating out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that approximates the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
Keywords :
computational linguistics; natural language interfaces; probability; speech recognition; statistical analysis; vocabulary; acoustic model; automatic speech recognition; category based approach; out-of-vocabulary word recognition; spontaneous speech data; spontaneous speech tasks; statistical language models; training corpus; word emission probability; word error rate; Acoustic applications; Acoustic emission; Automatic speech recognition; Context modeling; Information retrieval; Predictive models; Probability; Speech recognition; Telephony; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
Type :
conf
DOI :
10.1109/ICSLP.1996.607083
Filename :
607083