Title :
A hybrid diphone speech unit and a speech corpus construction technique for a Thai text-to-speech system on mobile devices
Author :
Wongpatikaseree, K. ; Ratikan, A. ; Chotimongkol, A. ; Chootrakool, P. ; Nattee, C. ; Theeramunkong, T. ; Kobayashi, T.
Author_Institution :
Sirindhorn Int. Inst. of Technol., Thammasat Univ., Pathumthani, Thailand
Abstract :
Most Thai text-to-speech (TTS) systems on personal computers can synthesize speech in real time with acceptable quality. However, when Thai TTS systems are ported to resource-limited platforms such as mobile devices, computation time has to be reduced, and the quality of the synthesized speech decreases as a result. Although Flite_Thai, a unit concatenation synthesizer for Thai, reduces computation time enough to run in real time, its output is quite unintelligible. In this paper, we aim to select an appropriate speech unit for Flite_Thai in order to improve its intelligibility. We design a new speech corpus that consists of three different speech units: demi-syllable, diphone, and a new speech unit called hybrid diphone. We record this corpus using a nonsense carrier sentence technique, since we focus on clear articulation of each speech unit. Each carrier sentence contains one speech unit, or a set of similar speech units, without regard to meaning. We compare the quality of speech synthesized using four types of speech units: a diphone from the TsynC corpus, recorded with natural sentences, and the three types of units from the new corpus, recorded with nonsense carrier sentences. In terms of intelligibility, all of the speech units from the new corpus achieved a higher MOS (Mean Opinion Score) than the existing Flite_Thai system, which uses speech units from TsynC. Among the three unit types in the new corpus, demi-syllable obtained the highest score. Although hybrid diphone obtained a higher MOS than both the existing system and the diphone, it still suffers from a similar problem: joins between units that are not smooth.
Keywords :
mobile radio; natural languages; speech processing; speech synthesis; Flite_Thai; Thai TTS system; Thai text-to-speech system; demi-syllable; hybrid diphone speech unit; mobile devices; nonsense carrier sentence technique; speech corpus construction technique; Electronic mail; High temperature superconductors; Information processing; Microcomputers; Mobile computing; Mobile handsets; Real time systems; Speech processing; Speech synthesis; Synthesizers;
Conference_Title :
Electrical Engineering/Electronics Computer Telecommunications and Information Technology (ECTI-CON), 2010 International Conference on
Conference_Location :
Chiang Mai
Print_ISBN :
978-1-4244-5606-2
Electronic_ISBN :
978-1-4244-5607-9