• DocumentCode
    1686443
  • Title
    Language model capitalization
  • Author
    Beaufays, Francoise ; Strope, Brian
  • Author_Institution
    Google, Mountain View, CA, USA
  • fYear
    2013
  • Firstpage
    6749
  • Lastpage
    6752
  • Abstract
    In many speech recognition systems, capitalization is not an inherent component of the language model: training corpora are downcased, and counts are accumulated for sequences of lower-cased words. This level of modeling is sufficient for automating voice commands or otherwise enabling users to communicate with a machine, but when the recognized speech is intended to be read by a person, such as in email dictation or even some web search applications, the lack of capitalization in the user's input can add an extra cognitive load on the reader. For these cases, speech recognition systems often post-process the recognized text to restore capitalization. We propose folding capitalization directly into the recognition language model. Instead of post-processing, we take the approach that language should be represented in all its richness, with capitalization, diacritics, and other special symbols. With that perspective, we describe a strategy to handle poorly capitalized or uncapitalized training corpora for language modeling. The resulting recognition system retains the accuracy/latency/memory tradeoff of our uncapitalized production recognizer, while providing properly cased outputs.
  • Keywords
    speech recognition; text detection; automating voice commands; language model capitalization; lower-cased words; speech recognition systems; training corpora; Accuracy; Data models; Error analysis; Speech; Speech recognition; Training; Training data; Capitalization; FST; language modeling;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6638968
  • Filename
    6638968