• DocumentCode
    2017308
  • Title

    GMM-based voice conversion with explicit modelling on feature transform

  • Author

    Chen, Ling-Hui ; Ling, Zhen-Hua ; Guo, Wu ; Dai, Li-Rong

  • Author_Institution
    iFLYTEK Speech Lab., Univ. of Sci. & Technol. of China, Hefei, China
  • fYear
    2010
  • fDate
    Nov. 29 2010-Dec. 3 2010
  • Firstpage
    364
  • Lastpage
    368
  • Abstract
    In this paper, we propose a Gaussian mixture model (GMM) based voice conversion method using explicit feature transform models. A piecewise linear transform with stochastic bias is adopted to represent the relationship between the spectral features of the source and target speakers. These explicit transformations are integrated into the training of the GMM for the joint probability density of source and target features. The maximum likelihood parameter generation algorithm with dynamic features is used to generate the converted spectral trajectories. Our method can model the cross-dimension correlations of the joint density GMM (JDGMM) while significantly decreasing the computational cost compared with a JDGMM with full covariance. Experimental results show that the proposed method outperformed the conventional GMM-based method in cross-gender voice conversion.
  • Keywords
    maximum likelihood estimation; piecewise linear techniques; probability; speech synthesis; GMM based voice conversion; Gaussian mixture model; computation cost; converted spectral trajectory; cross dimension correlation; dynamic feature; explicit modelling; feature transform; joint probability density; maximum likelihood parameter generation algorithm; piecewise linear transform; source speaker; spectral feature; stochastic bias; target speaker; Computational modeling; Covariance matrix; Heuristic algorithms; Hidden Markov models; Speech; Training; Transforms;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Chinese Spoken Language Processing (ISCSLP), 2010 7th International Symposium on
  • Conference_Location
    Tainan
  • Print_ISBN
    978-1-4244-6244-5
  • Type

    conf

  • DOI
    10.1109/ISCSLP.2010.5684869
  • Filename
    5684869