Title :
Named entity tagged language models
Author :
Gotoh, Yoshihiko ; Renals, Steve ; Williams, Gethin
Author_Institution :
Dept. of Comput. Sci., Sheffield Univ., UK
Abstract :
We introduce named entity (NE) language modelling, a stochastic finite state machine approach to identifying both words and NE categories from a stream of spoken data. We provide an overview of our approach to NE tagged language model (LM) generation, together with results from applying such an LM to the task of out-of-vocabulary (OOV) word reduction in large vocabulary speech recognition. Using the Wall Street Journal and Broadcast News corpora, we show that the tagged LM reduced the overall word error rate by 14%, detecting up to 70% of previously OOV words. We also describe an example of the direct tagging of spoken data with NE categories.
Keywords :
error statistics; finite state machines; natural languages; speech recognition; stochastic processes; Broadcast News corpus; Wall Street Journal corpus; large vocabulary speech recognition; named entity tagged language models; out-of-vocabulary word reduction; spoken data; stochastic finite state machine; word error rate reduction; Automata; Broadcasting; Computer science; Error analysis; Hidden Markov models; Natural languages; Speech recognition; Stochastic processes; Tagging; Vocabulary
Conference_Title :
Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing
Conference_Location :
Phoenix, AZ
Print_ISBN :
0-7803-5041-3
DOI :
10.1109/ICASSP.1999.758175