DocumentCode
2330763
Title
Improved named entity extraction from conversational speech with language model adaptation
Author
Siu, Man-Hung ; Vessenes, Ted ; Bulyko, Ivan ; Kimball, Owen
Author_Institution
Raytheon BBN Technologies, Cambridge, MA, USA
fYear
2010
fDate
12-15 Dec. 2010
Firstpage
418
Lastpage
423
Abstract
Named entity (NE) extraction is traditionally applied to written text. Some recent work extends extraction to broadcast speech, but most of these approaches simply cascade a speech-to-text (STT) engine and a named entity tagger without any information sharing between the two. In this paper, we extract named entities from conversational speech and explore approaches that couple STT and NE extraction beyond a simple cascade. We propose a new approach that adapts the STT language model (LM) based on the extracted named entities. This steers the STT to focus on the subset of the vocabulary that is most important to NE extraction. We performed a number of experiments on English conversational speech from the Fisher Corpus at different STT recognition speeds. We show that the LM adaptation approach increases NE extraction recall and improves NE performance by as much as 15%, as measured by F-score, by significantly reducing the word error rate (WER) of NE words, with minimal impact on the overall WER.
Keywords
speech synthesis; English conversational speech; broadcast speech extraction; language model adaptation; language model adaptation approach; named entity extraction; speech-to-text engine; adaptation; language model
fLanguage
English
Publisher
IEEE
Conference_Titel
Spoken Language Technology Workshop (SLT), 2010 IEEE
Conference_Location
Berkeley, CA
Print_ISBN
978-1-4244-7904-7
Electronic_ISBN
978-1-4244-7902-3
Type
conf
DOI
10.1109/SLT.2010.5700889
Filename
5700889