Title :
Compression of line spectral frequency parameters with asynchronous interpolation
Author :
Moldover, Rachel ; Kain, Alexander
Abstract :
Text-to-speech (TTS) systems require a trade-off between size and speech quality. A larger acoustic inventory allows synthesis of more natural-sounding speech. The Asynchronous Interpolation Model (AIM) improves the quality-to-size ratio, allowing better compression of large acoustic inventories as well as better-quality speech from a small system. At maximum compression, our method represents most phonemes by a single frame of data. Coarticulation effects are specified as context-specific non-linear interpolation functions. Dividing the speech features into multiple data streams allows asynchronous interpolation. In this study, AIM was applied to line spectral frequency (LSF) parameters. Varying the number of streams allows for a variable amount of compression. We used three different objective measures to investigate the effect of the number and partitioning of streams. The first few weight functions (and the last one) seem to offer the most error reduction. Partitions separating the first 6 LSFs score well with all three measures.
Keywords :
interpolation; speech synthesis; acoustic inventory; asynchronous interpolation; coarticulation effects; error reduction; line spectral frequency; speech features; speech quality; temporal decomposition; text-to-speech; Acoustic measurements; Acoustical engineering; Codecs; Frequency; Natural languages; Network synthesis; Telephony; Vectors; TTS; compression
Conference_Title :
Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on
Conference_Location :
Taipei
Print_ISBN :
978-1-4244-2353-8
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2009.4960452