Title :
Speaker-trained recognition using allophonic enrollment models
Author :
Vanhoucke, V. ; Hochberg, M.M. ; Leggetter, C.J.
Author_Institution :
Dept. of Electr. Eng., Stanford Univ., CA, USA
Abstract :
We introduce a method for performing speaker-trained recognition based on context-dependent allophone models from a large-vocabulary, speaker-independent recognition system. A set of speaker-enrollment templates is selected from the context-dependent allophone models. These templates are used to build representations of the speaker-enrolled utterances. The advantages of this approach include improved performance and portability of the enrollments across different acoustic models. We describe the approach used to select the enrollment templates and how to apply them to speaker-trained recognition. The approach has been evaluated on an over-the-telephone, voice-activated dialing task and shows significant performance improvements over techniques based on context-independent phone models or general acoustic model templates. In addition, the portability of enrollments from one model set to another is shown to result in almost no performance degradation.
Keywords :
learning (artificial intelligence); speech recognition; speech-based user interfaces; acoustic models; allophonic enrollment models; context-dependent models; speaker-enrollment templates; speaker-trained recognition; voice-activated dialing; Acoustics; Context modeling; Data mining; Databases; Degradation; Engines; Natural languages; Speech recognition; Testing; Vocabulary
Conference_Title :
Automatic Speech Recognition and Understanding, 2001. ASRU '01. IEEE Workshop on
Print_ISBN :
0-7803-7343-X
DOI :
10.1109/ASRU.2001.1034589