Title :
A category based approach for recognition of out-of-vocabulary words
Author :
Gallwitz, F. ; Nöth, E. ; Niemann, H.
Author_Institution :
Lehrstuhl für Mustererkennung, Erlangen-Nürnberg Univ., Germany
Abstract :
In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary words, even when the vocabulary is very large. We present a new approach for integrating out-of-vocabulary words into statistical language models. We use category information for all words in the training corpus to define a function that approximates the out-of-vocabulary word emission probability for each word category. This information is integrated into the language models. Although we use a simple acoustic model for out-of-vocabulary words, we achieve a 6% reduction in word error rate on spontaneous speech data with an out-of-vocabulary rate of about 5%.
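The abstract does not specify the function used to approximate the per-category out-of-vocabulary emission probability. As a rough illustration only, the sketch below assumes a Good-Turing-style estimate: within each category, the share of tokens whose word occurs exactly once in the training corpus stands in for the chance that the category emits an unseen word. The function name `oov_emission_probabilities`, the `(word, category)` input format, and the estimator itself are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def oov_emission_probabilities(tagged_corpus):
    """Estimate P(OOV | category) from (word, category) training pairs.

    Assumption (not from the paper): words seen exactly once in a category
    are used as a proxy for unseen words, so the estimate is
    (number of singleton words in category) / (number of tokens in category).
    """
    pair_counts = Counter(tagged_corpus)  # occurrences of each (word, category)
    tokens_per_category = Counter()
    singletons_per_category = Counter()

    for (word, category), count in pair_counts.items():
        tokens_per_category[category] += count
        if count == 1:
            singletons_per_category[category] += 1

    return {
        category: singletons_per_category[category] / total
        for category, total in tokens_per_category.items()
    }


# Hypothetical toy corpus: an open category (CITY) and a closed one (FUNC).
corpus = [
    ("berlin", "CITY"), ("munich", "CITY"), ("berlin", "CITY"),
    ("hamburg", "CITY"),
    ("the", "FUNC"), ("the", "FUNC"), ("the", "FUNC"),
]
probs = oov_emission_probabilities(corpus)
print(probs)
```

Under this assumed estimator, open categories such as city names receive a noticeably higher out-of-vocabulary probability than closed categories such as function words, which is the kind of per-category information a language model could then use.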
Keywords :
computational linguistics; natural language interfaces; probability; speech recognition; statistical analysis; vocabulary; acoustic model; automatic speech recognition; category based approach; out-of-vocabulary word recognition; spontaneous speech data; spontaneous speech tasks; statistical language models; training corpus; word emission probability; word error rate; Acoustic applications; Acoustic emission; Automatic speech recognition; Context modeling; Information retrieval; Predictive models; Probability; Speech recognition; Telephony; Vocabulary
Conference_Title :
Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
0-7803-3555-4
DOI :
10.1109/ICSLP.1996.607083