• DocumentCode
    1064959
  • Title
    Speaker adaptation using combined transformation and Bayesian methods
  • Author
    Digalakis, Vassilios V. ; Neumeyer, Leonardo G.
  • Author_Institution
    Dept. of Electron. & Comput. Eng., Tech. Univ. of Crete, Chania, Greece
  • Volume
    4
  • Issue
    4
  • fYear
    1996
  • fDate
    7/1/1996 12:00:00 AM
  • Firstpage
    294
  • Lastpage
    300
  • Abstract
    Adapting the parameters of a statistical speaker-independent continuous-speech recognizer to the speaker and the channel can significantly improve the recognition performance and robustness of the system. In continuous mixture-density hidden Markov models, the number of component densities is typically very large, and it may not be feasible to acquire a sufficient amount of adaptation data for robust maximum-likelihood estimates. To solve this problem, we have recently proposed a constrained estimation technique for Gaussian mixture densities. To improve the behavior of our adaptation scheme for large amounts of adaptation data, we combine it here with Bayesian techniques. We evaluate our algorithms on the large-vocabulary Wall Street Journal corpus for nonnative speakers of American English. The recognition error rate is approximately halved with only a small amount of adaptation data, and it approaches the speaker-independent accuracy achieved for native speakers.
  • Keywords
    Bayes methods; Gaussian processes; adaptive estimation; hidden Markov models; maximum likelihood estimation; speech recognition; American English; Bayesian method; Gaussian mixture densities; algorithms; constrained estimation technique; continuous mixture-density hidden Markov models; large-vocabulary Wall Street Journal corpus; native speakers; nonnative speakers; recognition error rate; recognition performance; robust maximum-likelihood estimates; robustness; speaker adaptation; statistical speaker independent continuous-speech recognizer; transformation method; Bayesian methods; Degradation; Error analysis; Hidden Markov models; Maximum likelihood estimation; Natural languages; Robustness; Speech recognition; Testing; Training data
  • fLanguage
    English
  • Journal_Title
    Speech and Audio Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-6676
  • Type
    jour
  • DOI
    10.1109/89.506933
  • Filename
    506933