DocumentCode :
336816
Title :
Named entity tagged language models
Author :
Gotoh, Yoshihiko ; Renals, Steve ; Williams, Gethin
Author_Institution :
Dept. of Comput. Sci., Sheffield Univ., UK
Volume :
1
fYear :
1999
fDate :
15-19 Mar 1999
Firstpage :
513
Abstract :
We introduce named entity (NE) language modelling, a stochastic finite state machine approach to identifying both words and NE categories from a stream of spoken data. We provide an overview of our approach to NE tagged language model (LM) generation together with results of the application of such an LM to the task of out-of-vocabulary (OOV) word reduction in large vocabulary speech recognition. Using the Wall Street Journal and Broadcast News corpora, it is shown that the tagged LM was able to reduce the overall word error rate by 14%, detecting up to 70% of previously OOV words. We also describe an example of the direct tagging of spoken data with NE categories.
Keywords :
error statistics; finite state machines; natural languages; speech recognition; stochastic processes; Broadcast News corpus; Wall Street Journal corpus; large vocabulary speech recognition; named entity tagged language models; out-of-vocabulary word reduction; spoken data; stochastic finite state machine; word error rate reduction; Automata; Broadcasting; Computer science; Error analysis; Hidden Markov models; Natural languages; Speech recognition; Stochastic processes; Tagging; Vocabulary
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing, 1999. Proceedings., 1999 IEEE International Conference on
Conference_Location :
Phoenix, AZ
ISSN :
1520-6149
Print_ISBN :
0-7803-5041-3
Type :
conf
DOI :
10.1109/ICASSP.1999.758175
Filename :
758175