• DocumentCode
    353622
  • Title
    On the incremental addition of regression classes for speaker adaptation
  • Author
    McDonough, John ; Venkataramani, V. ; Byrne, William
  • Author_Institution
    Center for Language & Speech Processing, Johns Hopkins Univ., Baltimore, MD, USA
  • Volume
    3
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    1539
  • Abstract
    We previously proposed the all-pass transform (APT) as the basis of a speaker adaptation scheme intended for use with a large vocabulary speech recognition system. It was shown that APT-based adaptation reduces to a linear transformation of cepstral means, much like the better-known maximum likelihood linear regression (MLLR). Due to this linearity, APT-based adaptation can be used in conjunction with speaker-adapted training (SAT), an algorithm for performing maximum likelihood estimation of the parameters of a hidden Markov model when speaker adaptation is to be employed during both training and test. In other work, we proposed a refinement of SAT, dubbed single-pass adapted training (SPAT), specifically tailored for use with the APT. Here we introduce an incremental training procedure intended for use with the APT and multiple regression classes. In a set of speech recognition experiments conducted on the Switchboard Corpus, we obtained a word error rate of 37.9% using APT adaptation, a significant improvement over the 39.5% word error rate achieved with MLLR.
  • Keywords
    cepstral analysis; hidden Markov models; maximum likelihood estimation; speech recognition; transforms; SAT; SPAT; all-pass transform; cepstral means; hidden Markov model; incremental addition; large vocabulary speech recognition system; linear transformation; regression classes; single-pass adapted training; speaker adaptation; speaker-adapted training; word error rate; Error analysis; Linearity; Maximum likelihood linear regression; Performance evaluation; Testing; Vocabulary
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on
  • Conference_Location
    Istanbul
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-6293-4
  • Type
    conf
  • DOI
    10.1109/ICASSP.2000.861954
  • Filename
    861954