Title :
Speaker compensation with sine-log all-pass transforms
Author :
McDonough, John ; Metze, Florian ; Soltau, Hagen ; Waibel, Alex
Author_Institution :
Interactive Syst. Labs., Karlsruhe Univ., Germany
Abstract :
In previous work, we proposed the rational all-pass transform (RAPT) as the basis of a speaker adaptation scheme intended for use with a large vocabulary speech recognition system. It was shown that RAPT-based adaptation reduces to a linear transformation of cepstral means, much like the better-known maximum likelihood linear regression (MLLR). In a set of speech recognition experiments conducted on the Switchboard Corpus, we obtained a word error rate (WER) of 37.9% using RAPT adaptation, a significant improvement over the 39.5% WER achieved with MLLR. In the present work, we propose the sine-log all-pass transform (SLAPT) as a replacement for the RAPT. Our findings indicate that the SLAPT is just as effective as the RAPT at reducing WER when used as the basis for a variety of speaker compensation schemes, but also lends itself to far more tractable computation of transformed cepstral sequences and to the estimation of optimal transform parameters.
Keywords :
cepstral analysis; compensation; error statistics; parameter estimation; speech recognition; transforms; RAPT; SLAPT; WER; large vocabulary speech recognition system; optimal transform parameters estimation; rational all-pass transform; sine-log all-pass transforms; speaker adaptation scheme; speaker compensation; speaker compensation schemes; transformed cepstral sequences; word error rate; Error analysis; Interactive systems; Laboratories; Linearity; Maximum likelihood estimation; Maximum likelihood linear regression; Vocabulary
Conference_Title :
Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
Conference_Location :
Salt Lake City, UT
Print_ISBN :
0-7803-7041-4
DOI :
10.1109/ICASSP.2001.940844