DocumentCode :
542302
Title :
Text normalization with varied data sources for conversational speech language modeling
Author :
Schwarm, Sarah ; Ostendorf, Mari
Author_Institution :
Dept. of Computer Science, University of Washington, Seattle, 98195, USA
Volume :
1
fYear :
2002
fDate :
13-17 May 2002
Abstract :
Collecting sufficient language model training data for good speech recognition performance in a new domain is often difficult. However, there may be other sources of data that are matched in terms of topic or style, if not both. This paper looks at the use of text normalization tools to make these data more suitable for language model training, in conjunction with mixture models to combine data from different sources. We specifically address the task of recognizing meeting speech, showing a small reduction in word error rate over a baseline language model trained from conversational speech data.
Keywords :
Computational modeling; Electronic mail; Vocabulary;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech, and Signal Processing (ICASSP), 2002 IEEE International Conference on
Conference_Location :
Orlando, FL, USA
ISSN :
1520-6149
Print_ISBN :
0-7803-7402-9
Type :
conf
DOI :
10.1109/ICASSP.2002.5743836
Filename :
5743836