DocumentCode
1749604
Title
Automatic transcription of voicemail at AT&T
Author
Bacchiani, Michiel
Author_Institution
AT&T Labs-Research, Florham Park, NJ, USA
Volume
1
fYear
2001
fDate
2001
Firstpage
25
Abstract
Reports on the automatic transcription accuracy of voicemail messages. It shows that vocal tract length normalization and adaptation using linear transformations, proven to improve accuracy on the Switchboard task, provide similar accuracy improvements on this task. Direct application of the normalization techniques is complicated by the fragmentation of the data. However, unsupervised clustering was found to be effective in ensuring robust estimation of normalization parameters. Variance adaptation resulted in larger accuracy improvements than adaptation of only mean parameters, probably due to a large variability in channel conditions. The use of semi-tied covariances provides additional gains over using speaker and channel normalization. The combined gain of using various compensation techniques improves the system word error rate from 34.9% for the baseline system to 28.7%.
Keywords
compensation; covariance matrices; parameter estimation; speech recognition; voice mail; AT&T; automatic transcription; channel conditions; compensation techniques; linear transformations; robust estimation; semi-tied covariances; transcription accuracy; unsupervised clustering; variance adaptation; vocal tract length normalization; voicemail; Clustering algorithms; Decorrelation; Error analysis; Loudspeakers; Maximum likelihood linear regression; Parameter estimation; Postal services; Robustness; Speech analysis; Voice mail
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location
Salt Lake City, UT
ISSN
1520-6149
Print_ISBN
0-7803-7041-4
Type
conf
DOI
10.1109/ICASSP.2001.940758
Filename
940758