• DocumentCode
    3481893
  • Title
    Robust voice conversion methods
  • Author
    Turk, Oytun; Arslan, Levent M.
  • Author_Institution
    Bogazici Univ., Istanbul, Turkey
  • fYear
    2004
  • fDate
    28-30 April 2004
  • Firstpage
    264
  • Lastpage
    267
  • Abstract
    Several problems in the training and transformation stages of voice conversion algorithms reduce output quality. This study focuses on improving output quality in STASC-based voice conversion and proposes five new methods. Three of them are employed in the training stage: robust end-point detection, pre-emphasis, and spectral equalization. The fourth method employs confidence measures to eliminate source and target HMM states that differ significantly in duration, vocal tract spectrum, pitch, and energy. The fifth method improves pitch detection: the optimal parameters of an autocorrelation-based pitch detector are determined separately for male and female speakers through detailed analysis, using f0 values obtained from electro-glottograph signals as the reference. The algorithm that employs the proposed methods is compared with STASC in subjective listening tests. With the new methods, similarity to the target voice increases by 23.0% and subjective quality by 28.8%.
  • Keywords
    correlation methods; hidden Markov models; spectral analysis; speech synthesis; STASC based voice conversion; autocorrelation based pitch detector; confidence measures; electro-glottograph signals; female speakers; male speakers; optimal parameters; pitch detection; pre-emphasis; robust end-point detection; robust voice conversion; source HMM states; spectral equalization; target HMM states; training; Detectors; Energy measurement; Gaussian processes; Hidden Markov models; Robustness; Testing; Viterbi algorithm;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Proceedings of the IEEE 12th Signal Processing and Communications Applications Conference, 2004
  • Print_ISBN
    0-7803-8318-4
  • Type
    conf
  • DOI
    10.1109/SIU.2004.1338310
  • Filename
    1338310