Segment pre-selection in decision-tree based speech synthesis systems

Author

Donovan, R.E.

Author_Institution

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume

2

fYear

2000

fDate

2000

Abstract

Corpus-based approaches to unit selection for concatenative speech synthesis have become popular in recent years because they are more sensitive to unit context than their simpler predecessors. These systems usually rely on large speech databases and employ sophisticated search algorithms to determine the optimal unit sequence for synthesising each sentence. For many applications it is not possible to make the entire database, which may be as large as several hundred megabytes, available to the synthesiser at runtime. What is required is some form of off-line pre-selection algorithm that determines which subset of the database enables the highest-quality speech synthesis for a given runtime system size. This paper describes a pre-selection algorithm developed at IBM for use with decision-tree-based concatenative speech synthesisers.

Keywords

decision trees; speech synthesis; IBM; concatenative speech synthesis; corpus based approaches; decision-tree based speech synthesis system; large speech databases; off-line pre-selection algorithm; optimal unit sequence; search algorithms; segment pre-selection; unit selection; Art; Databases; Degradation; Hidden Markov models; Image segmentation; Runtime; Signal processing; Signal synthesis; Speech synthesis; Training data

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 2000. ICASSP '00. Proceedings. 2000 IEEE International Conference on

Conference_Location

Istanbul

ISSN

1520-6149

Print_ISBN

0-7803-6293-4

Type

conf

DOI

10.1109/ICASSP.2000.859115

Filename

859115