Two stage concatenation speech synthesis for embedded devices

Author

Dong-jian, Yue

Author_Institution

Sch. of Comput. Eng. & Sci., Shanghai Univ., Shanghai, China

fYear

2010

fDate

23-25 Nov. 2010

Firstpage

1652

Lastpage

1656

Abstract

Although high-quality TTS engines based on concatenation speech synthesis have been developed and successfully deployed in many products (such as call center and information inquiry systems), the limited memory and computational power of many embedded devices, such as most low-tier cellular phones, hinder their implementation. Taking into account speech quality, memory storage, computational complexity, and the reusability of the CELP-based vocoder module (generally resident on the DSP of almost all cellular phones), this paper describes a practical two-stage concatenation speech synthesis scheme for low-tier phone applications. In the two-stage framework, the back-end processing of the TTS engine is divided into two phases, parameter concatenation and waveform synthesis, which are carried out by the MCU and the DSP of the mobile phone, respectively. Furthermore, a novel four-case smooth concatenation method is proposed to smooth the joins between speech units efficiently.

Keywords

computational complexity; microcontrollers; mobile handsets; speech coding; speech synthesis; vocoders; CELP; DSP; MCU; TTS engine; back-end processing; cellular phone obstacle; concatenation speech synthesis; embedded devices; memory storage; mobile phone; parameters concatenating; smooth concatenation method; speech quality; vocoder module; waveform synthesizing; Smoothing methods; Speech

fLanguage

English

Publisher

IEEE

Conference_Titel

Audio Language and Image Processing (ICALIP), 2010 International Conference on

Conference_Location

Shanghai

Print_ISBN

978-1-4244-5856-1

Type

conf

DOI

10.1109/ICALIP.2010.5685082

Filename

5685082