• DocumentCode
    703254
  • Title
    Combining Bayesian learning and vector field smoothing for on-line incremental speaker adaptation
  • Author
    Vair, C. ; Fissore, L.
  • Author_Institution
    CSELT - Centro Studi E Lab. Telecomun., Turin, Italy
  • fYear
    1998
  • fDate
    8-11 Sept. 1998
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    In this paper we investigate the combination of Bayesian learning (also known as Maximum A Posteriori, or MAP, estimation) and Vector Field Smoothing (VFS) for the on-line incremental speaker adaptation of Continuous Density Hidden Markov Models (CDHMMs). The parameters of the Gaussian mixture output densities are adapted during the MAP step, using an exponential forgetting mechanism and performing the a priori parameter estimation in a model-based outline. The unadapted MAP models are then reestimated using the VFS technique. Several tests compare the error rate reduction as a function of the incremental adaptation step and of the size of the adaptation data for each step. The experiments were run on a speaker-dependent continuous speech recognition task for the Italian language, with a test vocabulary of 247 words and no language model. The initial speaker-independent models were trained on PSTN speech data, while the adaptation data were collected in a quiet environment through a PBX connection. Cepstral mean normalization (CMN) was used to deal with the acoustic mismatch.
  • Keywords
    Bayes methods; Gaussian processes; hidden Markov models; learning (artificial intelligence); maximum likelihood estimation; mixture models; natural languages; speaker recognition; Bayesian learning; CDHMM; CMN; Gaussian mixture output densities; Italian language; MAP models; PBX connection; VFS technique; cepstral mean normalization; continuous density hidden Markov models; continuous speech recognition task; error rate reduction; exponential forgetting mechanism; maximum a posteriori; online incremental speaker adaptation; parameter estimation; speaker independent models; vector field smoothing; Acoustics; Adaptation models; Computational modeling; Data models; Silicon; Speech; Speech recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing Conference (EUSIPCO 1998), 9th European
  • Conference_Location
    Rhodes
  • Print_ISBN
    978-960-7620-06-4
  • Type
    conf
  • Filename
    7089725