DocumentCode
381278
Title
Investigating stochastic speech understanding
Author
Bonneau-Maynard, Hélène ; Lefevre, Francois
Author_Institution
Lab. d'Informatique pour la Mecanique et les Sci. de l'Ingenieur, CNRS, Orsay, France
fYear
2001
fDate
2001
Firstpage
260
Lastpage
263
Abstract
The need for human expertise in the development of a speech understanding system can be greatly reduced by the use of stochastic techniques. However, corpus-based techniques require the annotation of large amounts of training data. Manual semantic annotation of such corpora is tedious, expensive, and subject to inconsistencies. This work investigates the influence of the training corpus size on the performance of the understanding module. The use of automatically annotated data is also investigated as a means to increase the corpus size at a very low cost. First, a stochastic speech understanding model developed using data collected with the LIMSI ARISE dialog system is presented. Its performance is shown to be comparable to that of the rule-based caseframe grammar currently used in the system. In a second step, two ways of reducing the development cost are pursued: (1) reducing the amount of manually annotated data used to train the stochastic models and (2) using automatically annotated data in the training process.
Keywords
interactive systems; natural language interfaces; speech recognition; speech-based user interfaces; stochastic processes; LIMSI ARISE dialog system; automatically annotated data; development cost reduction; performance; speech understanding system; training corpus size; Costs; Data mining; Humans; Natural languages; Performance evaluation; Speech analysis; Stochastic processes; Stochastic systems; Telephony; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN
0-7803-7343-X
Type
conf
DOI
10.1109/ASRU.2001.1034637
Filename
1034637