Automatic generation of synthesis units based on context oriented clustering

Author

Nakajima, Shin-Ya ; Hamada, Hiroshi

Author_Institution

NTT Human Interface Lab., Kanagawa, Japan

fYear

1988

fDate

11-14 Apr 1988

Firstpage

659

Abstract

The authors propose a text-to-speech synthesis method based on automatic synthesis unit generation techniques using a natural speech database. They have termed the automatic procedure context oriented clustering (COC). Using the COC procedure, 627 phonetic synthesis units were generated automatically from 432 words uttered by a male speaker. This systematic approach has several advantages. First, as synthesis units can be generated automatically without any a priori phonological knowledge, it is easy to change the number of units and voices. Second, following from this, the technique can be applied to any language. Third, the generation of allophonic synthesis units depends not on human decisions but on the statistical characteristics of spectral parameters in natural speech. Thus, the generated units are more consistent than those obtained through other methods, with the result that more intelligible speech can be reconstructed.

Keywords

database management systems; speech synthesis; allophonic synthesis units; automatic synthesis unit generation; context oriented clustering; male speaker; natural speech; natural speech database; phonetic synthesis units; phonological knowledge; spectral parameters; statistical characteristics; text-to-speech synthesis; Character generation; Databases; Humans; Laboratories; Manuals; Natural languages; Speech coding; Speech recognition; Speech synthesis; Synthesizers

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech, and Signal Processing, 1988. ICASSP-88., 1988 International Conference on

Conference_Location

New York, NY

ISSN

1520-6149

Type

conf

DOI

10.1109/ICASSP.1988.196672

Filename

196672