• DocumentCode
    774722
  • Title

    Adaptive categorical understanding for spoken dialogue systems

  • Author

    Potamianos, Alexandros ; Narayanan, Shrikanth ; Riccardi, Giuseppe

  • Author_Institution
    Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
  • Volume
    13
  • Issue
    3
  • fYear
    2005
  • fDate
    5/1/2005
  • Firstpage
    321
  • Lastpage
    329
  • Abstract
    In this paper, the speech understanding problem in the context of a spoken dialogue system is formalized in a maximum likelihood framework. Off-line adaptation of stochastic language models that interpolate dialogue-state-specific and general application-level language models is proposed. Word and dialogue-state n-grams are used for building categorical understanding and dialogue models, respectively. Acoustic confidence scores are incorporated in the understanding formulation. Problems due to data sparseness and out-of-vocabulary words are discussed. The performance of the speech recognition and understanding language models is evaluated with the "Carmen Sandiego" multimodal computer game corpus. Incorporating dialogue models reduces the understanding error rate by a relative 15%-25%, while acoustic confidence scores achieve a further 10% error reduction for this computer gaming application.
  • Keywords
    computer games; interactive systems; maximum likelihood estimation; natural languages; speech recognition; acoustic confidence score; adaptive categorical understanding; computer gaming application; data sparseness; dialogue state language model; error rate; error reduction; maximum likelihood framework; multimodal computer game corpus; out-of-vocabulary word; speech recognition; speech understanding problem; spoken dialogue system; Acoustic applications; Adaptation model; Computer applications; Computer errors; Error analysis; Maximum likelihood decoding; Natural languages; Routing; Speech recognition; Stochastic processes; Acoustic confidence scores; dialogue modeling; language model adaptation; n-gram models; natural language processing; speech understanding;
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type

    jour

  • DOI
    10.1109/TSA.2005.845836
  • Filename
    1420367