DocumentCode
1686443
Title
Language model capitalization
Author
Beaufays, Francoise ; Strope, Brian
Author_Institution
Google, Mountain View, CA, USA
fYear
2013
Firstpage
6749
Lastpage
6752
Abstract
In many speech recognition systems, capitalization is not an inherent component of the language model: training corpora are downcased, and counts are accumulated for sequences of lower-cased words. This level of modeling is sufficient for automating voice commands or otherwise enabling users to communicate with a machine, but when the recognized speech is intended to be read by a person, such as in email dictation or even some web search applications, the lack of capitalization of the user's input can add an extra cognitive load on the reader. For these cases, speech recognition systems often post-process the recognized text to restore capitalization. We propose folding capitalization directly into the recognition language model. Instead of post-processing, we take the approach that language should be represented in all its richness, with capitalization, diacritics, and other special symbols. With that perspective, we describe a strategy to handle poorly capitalized or uncapitalized training corpora for language modeling. The resulting recognition system retains the accuracy/latency/memory tradeoff of our uncapitalized production recognizer, while providing properly cased outputs.
Keywords
speech recognition; text detection; automating voice commands; language model capitalization; lower-cased words; speech recognition systems; training corpora; Accuracy; Data models; Error analysis; Speech; Speech recognition; Training; Training data; Capitalization; FST; language modeling
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6638968
Filename
6638968