Title :
Automatic transcription of voicemail at AT&T
Author :
Bacchiani, Michiel
Author_Institution :
AT&T Labs-Research, Florham Park, NJ, USA
Abstract :
This paper reports on the accuracy of automatic transcription of voicemail messages. It shows that vocal tract length normalization and adaptation using linear transformations, both shown to improve accuracy on the Switchboard task, provide similar accuracy improvements on this task. Direct application of the normalization techniques is complicated by the fragmentation of the data. However, unsupervised clustering was found to be effective in ensuring robust estimation of the normalization parameters. Variance adaptation resulted in larger accuracy improvements than adaptation of the mean parameters alone, probably because of the large variability in channel conditions. The use of semi-tied covariances provides additional gains over speaker and channel normalization. Combined, these compensation techniques reduce the system word error rate from 34.9% for the baseline system to 28.7%.
Keywords :
compensation; covariance matrices; parameter estimation; speech recognition; voice mail; AT&T; automatic transcription; channel conditions; compensation techniques; linear transformations; robust estimation; semi-tied covariances; transcription accuracy; unsupervised clustering; variance adaptation; vocal tract length normalization; voicemail; Clustering algorithms; Decorrelation; Error analysis; Loudspeakers; Maximum likelihood linear regression; Postal services; Robustness; Speech analysis
Conference_Title :
2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), Proceedings
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.940758