DocumentCode
774722
Title
Adaptive categorical understanding for spoken dialogue systems
Author
Potamianos, Alexandros ; Narayanan, Shrikanth ; Riccardi, Giuseppe
Author_Institution
Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
Volume
13
Issue
3
fYear
2005
fDate
5/1/2005 12:00:00 AM
Firstpage
321
Lastpage
329
Abstract
In this paper, the speech understanding problem in the context of a spoken dialogue system is formalized in a maximum likelihood framework. Off-line adaptation of stochastic language models that interpolate dialogue state specific and general application-level language models is proposed. Word and dialogue-state n-grams are used for building categorical understanding and dialogue models, respectively. Acoustic confidence scores are incorporated in the understanding formulation. Problems due to data sparseness and out-of-vocabulary words are discussed. The performance of the speech recognition and understanding language models is evaluated with the "Carmen Sandiego" multimodal computer game corpus. Incorporating dialogue models reduces relative understanding error rate by 15%-25%, while acoustic confidence scores achieve a further 10% error reduction for this computer gaming application.
Keywords
computer games; interactive systems; maximum likelihood estimation; natural languages; speech recognition; acoustic confidence score; adaptive categorical understanding; computer gaming application; data sparseness; dialogue state language model; error rate; error reduction; maximum likelihood framework; multimodal computer game corpus; out-of-vocabulary word; speech recognition; speech understanding problem; spoken dialogue system; Acoustic applications; Adaptation model; Computer applications; Computer errors; Error analysis; Maximum likelihood decoding; Natural languages; Routing; Speech recognition; Stochastic processes; Acoustic confidence scores; dialogue modeling; language model adaptation; n-gram models; natural language processing; speech understanding;
fLanguage
English
Journal_Title
Speech and Audio Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1063-6676
Type
jour
DOI
10.1109/TSA.2005.845836
Filename
1420367