DocumentCode
3101589
Title
Refining Unit Boundaries for Mandarin Text-to-Speech Database
Author
Dong, Minghui ; Cen, Ling ; Chan, Paul ; Li, Haizhou
Author_Institution
Inst. for Infocomm Res. (I2R), A*STAR, Singapore, Singapore
fYear
2009
fDate
7-9 Dec. 2009
Firstpage
245
Lastpage
248
Abstract
In unit-selection-based text-to-speech (TTS) synthesis, the accuracy of the unit boundary positions in the unit selection database is one of the factors that determine the quality of the synthesized speech. To ensure accurate boundary positions, developers often have to manually verify the speech boundaries generated by automatic speech recognition techniques. To reduce this manual workload, automatic methods for refining the positions of the unit boundaries are needed. This paper proposes a frame-shift method to find the globally optimal joint position for unit concatenation between any two matching units. Experimental results show that this method can improve the boundary accuracy compared to manual labeling.
Keywords
database management systems; speech recognition; speech synthesis; Mandarin text-to-speech database; automatic speech recognition techniques; frame-shift method; unit selection based text-to-speech synthesis; Automatic speech recognition; Databases; Feature extraction; Hidden Markov models; Mel frequency cepstral coefficient; Natural languages; Optimization methods; Signal processing algorithms; Speech processing; Speech synthesis; optimization; unit boundary; unit selection
fLanguage
English
Publisher
IEEE
Conference_Titel
Asian Language Processing, 2009. IALP '09. International Conference on
Conference_Location
Singapore
Print_ISBN
978-0-7695-3904-1
Type
conf
DOI
10.1109/IALP.2009.59
Filename
5380742