• DocumentCode
    3245501
  • Title

    Improving the robustness of prosody dependent language modeling based on prosody syntax dependence

  • Author

    Chen, Ken ; Hasegawa-Johnson, Mark

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Illinois Univ., Urbana, IL, USA
  • fYear
    2003
  • fDate
    30 Nov.-3 Dec. 2003
  • Firstpage
    435
  • Lastpage
    440
  • Abstract
    The paper presents a novel approach that improves the robustness of prosody dependent language modeling by leveraging the dependence between prosody and syntax. A prosody dependent language model describes the joint probability distribution of concurrent word and prosody sequences and can be used to provide prior language constraints in a prosody dependent speech recognizer. Robust maximum likelihood (ML) estimation of prosody dependent n-gram language models requires a large amount of prosodically transcribed data. We show that prosody-syntax dependence can be utilized to diminish the data sparseness introduced by prosody dependent modeling. Experiments on a radio news corpus show that the prosody dependent language model estimated using our approach reduces the joint perplexity by up to 34% as compared with the standard ML-estimated prosody dependent language model; the word perplexity can be reduced by up to 84% as compared with the standard ML-estimated prosody independent language model. In recognition experiments, the language model estimated by our approach creates an improvement of 1% in word recognition accuracy, 0.7% in accent recognition accuracy and 1.5% in intonational phrase boundary (IPB) recognition accuracy over a baseline prosody dependent language model.
  • Keywords
    linguistics; maximum likelihood estimation; natural languages; probability; sequences; speech recognition; accent recognition; data sparseness; intonational phrase boundary recognition; prosody dependent language modeling; prosody dependent speech recognizer; prosody syntax dependence; radio news corpus; word perplexity; word recognition; Aging; Automatic speech recognition; Maximum likelihood detection; Maximum likelihood estimation; Natural languages; Probability distribution; Rhythm; Robustness; Speech recognition; Text recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
  • Print_ISBN
    0-7803-7980-2
  • Type

    conf

  • DOI
    10.1109/ASRU.2003.1318480
  • Filename
    1318480