DocumentCode
2801305
Title
Factor analyzed voice models for HMM-based speech synthesis
Author
Kazumi, Kyosuke ; Nankaku, Yoshihiko ; Tokuda, Keiichi
Author_Institution
Nagoya Inst. of Technol., Nagoya, Japan
fYear
2010
fDate
14-19 March 2010
Firstpage
4234
Lastpage
4237
Abstract
This paper describes factor analyzed voice models for realizing various voice characteristics in HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the PCA underlying the eigenvoice method aims to accurately reconstruct each speaker-dependent HMM set, which is not equivalent to estimating models that represent the training data accurately. To overcome this problem, we propose a general speech model that directly generates speech utterances with various voice characteristics. In the proposed method, the HMM states, the factors representing voice characteristics, and the contextual decision trees are simultaneously optimized within a unified framework.
Keywords
hidden Markov models; interpolation; maximum likelihood estimation; principal component analysis; speaker recognition; speech synthesis; HMM set interpolation; HMM-based speech synthesis; PCA; contextual decision tree; factor analyzed voice model; speaker-dependent HMM set; Algorithm design and analysis; Annealing; Character generation; Decision trees; Speech analysis; Training data; deterministic annealing EM algorithm; eigenvoice; expectation maximization algorithm; factor analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
Conference_Location
Dallas, TX
ISSN
1520-6149
Print_ISBN
978-1-4244-4295-9
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2010.5495689
Filename
5495689