DocumentCode :
3124368
Title :
A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis
Author :
Lee, S.W. ; Minghui Dong ; Haizhou Li
Author_Institution :
Human Language Technol. Dept., A*STAR, Singapore, Singapore
fYear :
2012
fDate :
5-8 Dec. 2012
Firstpage :
150
Lastpage :
154
Abstract :
Natural pitch fluctuation is essential to the singing voice. We recently proposed a generalized F0 modelling method that uses note HMMs to model the expected F0 fluctuation under various contexts. Because F0 contours close to those of professional human singers improve perceived quality, two requirements arise: (1) accurate F0 estimation and (2) precise voiced/unvoiced decisions. In this paper, we introduce two techniques that address these requirements. First, the influence of lyric phonetics on singing F0 is considered, to capture the F0 and voicing behaviour arising from different note-lyrics combinations. Second, the generalized F0 modelling method is extended to the frequency domain to study whether characterizing contour shape in terms of sinusoids helps F0 estimation. Our experiments showed that using lyrics information leads to better F0 generation and improves the naturalness of synthesized singing. The frequency-domain representation is viable, but it is less competitive than the time-domain representation and requires further study.
Keywords :
frequency-domain analysis; hidden Markov models; speech synthesis; F0 fluctuation; F0 generation method; F0 modelling method; HMM; frequency-domain representation; human professional singing; lyrics characterization; natural pitch fluctuation; note-lyrics combinations; precise unvoiced decisions; precise voiced decisions; shape characterization; singing voice synthesis; time-domain representation; Context; Discrete cosine transforms; Fluctuations; Frequency domain analysis; Hidden Markov models; Time domain analysis; Training; lyrics; modelling; pitch; singing; synthesis;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
Type :
conf
DOI :
10.1109/ISCSLP.2012.6423491
Filename :
6423491