Title :
Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis
Author :
Yamagishi, Junichi ; Nose, Takashi ; Zen, Heiga ; Ling, Zhen-Hua ; Toda, Tomoki ; Tokuda, Keiichi ; King, Simon ; Renals, Steve
Author_Institution :
Centre for Speech Technology Research (CSTR), University of Edinburgh, Edinburgh, UK
Abstract :
This paper describes a speaker-adaptive HMM-based speech synthesis system. The new system, called “HTS-2007,” employs speaker adaptation (CSMAPLR+MAP), feature-space adaptive training, mixed-gender modeling, and full-covariance modeling using CSMAPLR transforms, in addition to several other techniques that have proved effective in our previous systems. Subjective evaluation results show that the new system generates significantly better quality synthetic speech than speaker-dependent approaches with realistic amounts of speech data, and that it bears comparison with speaker-dependent approaches even when large amounts of speech data are available. In addition, a comparison study with several speech synthesis techniques shows the new system is very robust: It is able to build voices from less-than-ideal speech data and synthesize good-quality speech even for out-of-domain sentences.
Keywords :
hidden Markov models; learning (artificial intelligence); speech synthesis; text analysis; full-covariance modeling; mixed-gender modeling; robust speaker-adaptive hidden Markov model; speaker-dependent approach; text-to-speech synthesis; Computer science; Information science; Robustness; Speech analysis; Average voice; HMM Speech Synthesis System (HTS); HMM-based speech synthesis; speaker adaptation; voice conversion
Journal_Title :
IEEE Transactions on Audio, Speech, and Language Processing
DOI :
10.1109/TASL.2009.2016394