• DocumentCode
    700143
  • Title
    Non-parallel hierarchical training for voice conversion
  • Author
    Mesbahi, Larbi ; Barreaud, Vincent ; Boeffard, Olivier
  • Author_Institution
    ENSSAT, Univ. of Rennes 1, Lannion, France
  • fYear
    2008
  • fDate
    25-29 Aug. 2008
  • Firstpage
    1
  • Lastpage
    5
  • Abstract
    Many research topics in speech processing face the same difficult problem: how to cheaply (or quickly) create a parallel corpus which associates the acoustic realizations of two speakers who have pronounced the same linguistic content. Among those topics are voice conversion techniques and some aspects of speech and speaker recognition. In the context of voice conversion, we propose a new methodology to map the source speaker vectors to those of a target speaker, without any parallel corpus and without using DTW (Dynamic Time Warping). The proposed approach is based on a hierarchical decomposition of the source and target acoustic spaces. At each level, source and target class centroids of a reduced subspace are paired. We propose an evaluation of our algorithm when applied to GMM-based voice conversion on the ARCTIC database.
  • Keywords
    Gaussian processes; acoustic signal processing; mixture models; speaker recognition; speech processing; ARCTIC database; DTW; GMM-based voice conversion; dynamic time warping; hierarchical source decomposition; nonparallel hierarchical training; source speaker vector; speech recognition; target acoustic space; Europe; Joints; Mel frequency cepstral coefficient; Speech; Trajectory; Vectors
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Signal Processing Conference, 2008 16th European
  • Conference_Location
    Lausanne
  • ISSN
    2219-5491
  • Type
    conf
  • Filename
    7080675