Do not build your TTS training corpus randomly

Author

Jonathan Chevelu; Damien Lolive

Author_Institution

IRISA - University of Rennes 1, Lannion, France

fYear

2015

Firstpage

350

Lastpage

354

Abstract

TTS voice building generally relies on a script extracted from a large text corpus while optimizing the coverage of linguistic and phonological events that are assumed to be related to the acoustic quality of the voice. Previous work has shown differences in objective measures between smartly reduced and random corpora, but not in subjective evaluations. We argue that this result does not reflect a lack of usefulness of corpus reduction, but rather evaluations that smooth out the differences. In this article, we highlight those differences in a subjective test by clustering test corpora according to a distance between signals, so as to focus on the synthesized stimuli that differ. The results show that covering appropriate features has a real impact on perceived quality.

Keywords

"Speech","Europe","Speech synthesis","Greedy algorithms","Pragmatics","Buildings"

Publisher

IEEE

Conference_Title

Signal Processing Conference (EUSIPCO), 2015 23rd European

Electronic_ISBN

2076-1465

Type

conf

DOI

10.1109/EUSIPCO.2015.7362403

Filename

7362403

Link To Document

https://search.isc.ac/dl/search/defaultta.aspx?DTC=49&DC=3715856