• DocumentCode
    1749663
  • Title

    Linear feature space projections for speaker adaptation

  • Author

    Saon, George ; Zweig, Geoffrey ; Padmanabhan, Mukund

  • Author_Institution
    IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA
  • Volume
    1
  • fYear
    2001
  • fDate
    2001
  • Firstpage
    325
  • Abstract
    We extend the well-known technique of constrained maximum likelihood linear regression (MLLR) to compute a projection (instead of a full rank transformation) on the feature vectors of the adaptation data. We model the projected features with phone-dependent Gaussian distributions and also model the complement of the projected space with a single class-independent, speaker-specific Gaussian distribution. Subsequently, we compute the projection and its complement using maximum likelihood techniques. The resulting ML transformation is shown to be equivalent to performing a speaker-dependent heteroscedastic discriminant (or HDA) projection. Our method is in contrast to traditional approaches which use a single speaker-independent projection, and execute speaker adaptation in the resulting subspace. Experimental results on Switchboard show a 3% relative improvement in the word error rate over constrained MLLR in the projected subspace only.
  • Keywords
    Gaussian distribution; adaptive systems; feature extraction; maximum likelihood estimation; speech recognition; ML transformation; adaptation data; class independent Gaussian distribution; constrained maximum likelihood linear regression; feature vectors; linear feature space projection; phone-dependent Gaussian distribution; speaker adaptation; speaker-dependent heteroscedastic discriminant; speaker-specific Gaussian distribution; speech recognition systems; word error rate; Cepstral analysis; Error analysis; Linear discriminant analysis; Linear regression; Loudspeakers; Maximum likelihood linear regression; Modems; Subspace constraints; Vectors
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2001. Proceedings. (ICASSP '01). 2001 IEEE International Conference on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-7041-4
  • Type

    conf

  • DOI
    10.1109/ICASSP.2001.940833
  • Filename
    940833