On the incremental addition of regression classes for speaker adaptation

Author

McDonough, John ; Venkataramani, V. ; Byrne, William

Author_Institution

Center for Language & Speech Processing, Johns Hopkins Univ., Baltimore, MD, USA

Volume

3

fYear

2000

fDate

2000

Firstpage

1539

Abstract

We previously proposed the all-pass transform (APT) as the basis of a speaker adaptation scheme intended for use with a large vocabulary speech recognition system. It was shown that APT-based adaptation reduces to a linear transformation of cepstral means, much like the better-known maximum likelihood linear regression (MLLR). Due to this linearity, APT-based adaptation can be used in conjunction with speaker-adapted training (SAT), an algorithm for performing maximum likelihood estimation of the parameters of a hidden Markov model when speaker adaptation is to be employed during both training and test. In other work, we proposed a refinement of SAT dubbed single-pass adapted training (SPAT), specifically tailored for use with the APT. Here we introduce an incremental training procedure intended for use with the APT and multiple regression classes. In a set of speech recognition experiments conducted on the Switchboard Corpus, we obtained a word error rate of 37.9% using APT adaptation, a significant improvement over the 39.5% word error rate achieved with MLLR.

Keywords

cepstral analysis; hidden Markov models; maximum likelihood estimation; speech recognition; transforms; SAT; SPAT; all-pass transform; cepstral means; hidden Markov model; incremental addition; large vocabulary speech recognition system; linear transformation; regression classes; single-pass adapted training; speaker adaptation; speaker-adapted training; word error rate; Error analysis; Linearity; Maximum likelihood linear regression; Performance evaluation; Testing; Vocabulary

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on

Conference_Location

Istanbul

ISSN

1520-6149

Print_ISBN

0-7803-6293-4

Type

conf

DOI

10.1109/ICASSP.2000.861954

Filename

861954