DocumentCode :
1252120
Title :
Speaker clustering and transformation for speaker adaptation in speech recognition systems
Author :
Padmanabhan, Mukund ; Bahl, Lalit R. ; Nahamoo, David ; Picheny, Michael A.
Author_Institution :
IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
Volume :
6
Issue :
1
fYear :
1998
fDate :
1/1/1998
Firstpage :
71
Lastpage :
77
Abstract :
A speaker adaptation strategy is described that is based on finding a subset of speakers, from the training set, who are acoustically close to the test speaker, and using only the data from these speakers (rather than the complete training corpus) to reestimate the system parameters. Further, a linear transformation is computed for every one of the selected training speakers to better map the training speaker's data to the test speaker's acoustic space. Finally, the system parameters (Gaussian means) are reestimated specifically for the test speaker using the transformed data from the selected training speakers. Experiments showed that this scheme is capable of providing an 18% relative improvement in the error rate on a large-vocabulary task with the use of as little as three sentences of adaptation data.
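Illustrative sketch (not from the paper): the three steps named in the abstract (select close speakers, transform their data, reestimate Gaussian means) could be laid out roughly as follows. The abstract does not give the selection criterion or how the transformation is estimated, so both are stand-in assumptions here: closeness is scored by the likelihood of the adaptation frames under a single diagonal Gaussian per training speaker, and the linear transformation is a per-dimension scale and shift. All function names, the frame-to-Gaussian labels, and the synthetic data are hypothetical.

```python
import numpy as np

def select_close_speakers(train_feats, adapt_feats, k):
    """Return the k training speakers that score the adaptation data best."""
    # Assumption: one diagonal Gaussian per training speaker stands in for
    # whatever acoustic-closeness measure the paper actually uses.
    scores = {}
    for spk, x in train_feats.items():
        mu, var = x.mean(axis=0), x.var(axis=0) + 1e-6
        ll = -0.5 * (np.log(2 * np.pi * var) + (adapt_feats - mu) ** 2 / var)
        scores[spk] = ll.sum(axis=1).mean()  # average per-frame log-likelihood
    return sorted(scores, key=scores.get, reverse=True)[:k]

def fit_diagonal_affine(src, dst):
    """Per-dimension scale a and shift b so that a*src + b matches dst's mean and spread."""
    # Assumption: a diagonal affine map stands in for the paper's linear transformation.
    a = dst.std(axis=0) / (src.std(axis=0) + 1e-6)
    b = dst.mean(axis=0) - a * src.mean(axis=0)
    return a, b

def reestimate_means(train_feats, train_labels, adapt_feats, n_gauss, k):
    """Reestimate Gaussian means from the transformed data of the selected speakers."""
    selected = select_close_speakers(train_feats, adapt_feats, k)
    sums = np.zeros((n_gauss, adapt_feats.shape[1]))
    counts = np.zeros(n_gauss)
    for spk in selected:
        a, b = fit_diagonal_affine(train_feats[spk], adapt_feats)
        mapped = a * train_feats[spk] + b             # map into test speaker's space
        np.add.at(sums, train_labels[spk], mapped)    # accumulate frames per Gaussian
        np.add.at(counts, train_labels[spk], 1)
    means = sums / np.maximum(counts, 1)[:, None]
    return selected, means

# Synthetic data: four training speakers, a short adaptation set near "spk1".
rng = np.random.default_rng(0)
offsets = {"spk0": 0.0, "spk1": 5.0, "spk2": 10.0, "spk3": 15.0}
train_feats = {s: rng.normal(o, 1.0, size=(200, 3)) for s, o in offsets.items()}
train_labels = {s: rng.integers(0, 4, size=200) for s in offsets}  # frame-to-Gaussian ids
adapt_feats = rng.normal(5.2, 1.0, size=(30, 3))

selected, means = reestimate_means(train_feats, train_labels, adapt_feats, n_gauss=4, k=2)
```

In this toy setup `selected` holds the two speakers that best explain the adaptation frames, and `means` holds one reestimated mean vector per Gaussian. A real system would presumably take the frame-to-Gaussian alignment from its acoustic model rather than from random labels.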
Keywords :
Gaussian distribution; parameter estimation; speech processing; speech recognition; Gaussian means; adaptation data; computational complexity; error rate; experiments; large-vocabulary task; linear transformation; sentences; speaker adaptation; speaker clustering; speaker transformation; speech recognition systems; system parameters; system parameters reestimation; test speaker acoustic space; training set; training speaker data; Acoustic testing; Error analysis; Loudspeakers; Parameter estimation; Robustness; Signal processing; Speech recognition; System testing; Training data; Vectors;
fLanguage :
English
Journal_Title :
Speech and Audio Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1063-6676
Type :
jour
DOI :
10.1109/89.650313
Filename :
650313