Title :
Using articulatory information for speaker adaptation
Author :
Metze, Florian ; Waibel, Alex
Author_Institution :
Interactive Syst. Labs., Univ. Karlsruhe, Germany
Date :
30 Nov.-3 Dec. 2003
Abstract :
Articulatory features (AF) have proven beneficial for automatic speech recognition (ASR) in noisy environments, for hyper-articulated speech, and in multilingual settings. A stream setup can combine standard sub-phone Gaussian mixture models (GMMs) with feature GMMs; the weights assigned to each feature stream, such as VOICED or BILABIAL, could intuitively be used for adaptation to speaker or text. In this paper, we investigate this stream setup, which allows us to add articulatory information to a baseline CD-HMM recognizer, on a database containing several speakers in a number of recordings of spontaneous speech. Our findings indicate that articulatory features, as we use them, are not entirely a speaker-dependent property, but when using them for speaker adaptation, we find their performance to be comparable to that of constrained MLLR.
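Illustrative note (not part of the original record): the abstract does not spell out how the streams are combined. A minimal sketch, assuming a log-linear combination in which each stream's GMM log-likelihood is scaled by its stream weight and the weights sum to 1, might look as follows. The function names, the scalar observation, and the normalization constraint are all hypothetical simplifications; a real recognizer would use multi-dimensional feature vectors and would pick the feature GMM per state (for example, a VOICED or a non-VOICED model).

```python
import math


def log_gauss_mix(x, components):
    """Log-likelihood of a scalar x under a 1-D Gaussian mixture.

    components: list of (mixture_weight, mean, variance) tuples.
    """
    terms = [
        math.log(w) - 0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for w, mu, var in components
    ]
    # Log-sum-exp for numerical stability.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def combined_score(x, phone_gmm, feature_gmms, weights):
    """Weighted sum of per-stream log-likelihoods (assumed log-linear rule).

    weights[0] scales the sub-phone stream; weights[1:] scale the
    articulatory feature streams, in the same order as feature_gmms.
    """
    if len(weights) != 1 + len(feature_gmms):
        raise ValueError("one weight per stream is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("stream weights are assumed to sum to 1")
    score = weights[0] * log_gauss_mix(x, phone_gmm)
    for w, gmm in zip(weights[1:], feature_gmms):
        score += w * log_gauss_mix(x, gmm)
    return score
```

Under this assumption, adapting to a speaker would amount to re-estimating only the small weight vector while leaving the GMM parameters untouched; setting every feature weight to 0 recovers the baseline sub-phone score.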
Keywords :
Gaussian distribution; feature extraction; hidden Markov models; speech recognition; ASR; BILABIAL stream; CD-HMM recognizer; VOICED stream; articulatory features; automatic speech recognition; feature GMM; hyper-articulated speech; multilingual settings; noisy environments; performance; speaker adaptation; spontaneous speech database; sub-phone Gaussian mixture models; Acoustic noise; Automatic speech recognition; Detectors; Error analysis; Hidden Markov models; Interactive systems; Loudspeakers; Spatial databases; Speech recognition; Working environment noise;
Conference_Title :
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN :
0-7803-7980-2
DOI :
10.1109/ASRU.2003.1318475