Title :
A new speaker adaptation technique using very short calibration speech
Author_Institution :
Panasonic Technologies Inc., Santa Barbara, CA, USA
Abstract :
A speaker adaptation technique based on the separation of speech spectra variation sources is developed for improving speaker-independent continuous speech recognition. The variation sources include speaker acoustic characteristics, phonologic characteristics, and contextual dependency of allophones. Statistical methods are formulated to normalize speech spectra based on speaker acoustic characteristics and then adapt mixture Gaussian density phone models based on speaker phonologic characteristics. Adaptation experiments using short calibration speech (5 s/speaker) show substantial performance improvement over the baseline recognition system. On a TIMIT test set, where the task vocabulary size is 853 and the test set perplexity is 104, the recognition word accuracy improves from 86.9% to 90.6% (28.2% error reduction). On a separate test set, which contains an additional variation source of recording channel mismatch and has a test set perplexity of 101, the recognition word accuracy improves from 65.4% to 85.5% (58.1% error reduction).
Keywords :
adaptive systems; calibration; speech recognition; vocabulary; TIMIT test set; calibration speech; contextual dependency of allophones; mixture Gaussian density phone models; performance; perplexity; phonologic characteristics; recognition word accuracy; speaker acoustic characteristics; speaker adaptation technique; speaker-independent continuous speech recognition
Conference_Title :
Acoustics, Speech, and Signal Processing, 1993. ICASSP-93., 1993 IEEE International Conference on
Conference_Location :
Minneapolis, MN, USA
Print_ISBN :
0-7803-7402-9
DOI :
10.1109/ICASSP.1993.319369
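
Note on the error reduction figures :
The error reductions quoted in the abstract can be reproduced from the word accuracies alone. The sketch below assumes that word error rate is simply 100 minus word accuracy, which the abstract does not state explicitly but which is consistent with its numbers. The function name `relative_error_reduction` is hypothetical and chosen only for illustration.

```python
def relative_error_reduction(baseline_acc: float, adapted_acc: float) -> float:
    """Relative word error reduction (%) from two word accuracies (%).

    Assumption: word error rate = 100 - word accuracy.
    """
    baseline_err = 100.0 - baseline_acc
    adapted_err = 100.0 - adapted_acc
    return 100.0 * (baseline_err - adapted_err) / baseline_err


# Word accuracies (%) as reported in the abstract.
timit = relative_error_reduction(86.9, 90.6)
mismatch = relative_error_reduction(65.4, 85.5)

print(f"TIMIT test set: {timit:.1f}% error reduction")
print(f"Channel-mismatch test set: {mismatch:.1f}% error reduction")
```

Under that assumption, the two values round to the 28.2% and 58.1% given in the abstract.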