• DocumentCode
    2500798
  • Title

    A real-time Thai speech synthesizer on a mobile device

  • Author

    Wongpatikaseree, K. ; Ratikan, A. ; Thangthai, A. ; Chotimongkol, A. ; Nattee, C.

  • Author_Institution
    Sirindhorn Int. Inst. of Technol., Thammasat Univ., Pathum Thani, Thailand
  • fYear
    2009
  • fDate
    20-22 Oct. 2009
  • Firstpage
    42
  • Lastpage
    47
  • Abstract
    Several Thai TTS systems are already available on resource-rich platforms such as personal computers. However, porting these systems to a resource-limited device such as a mobile phone is not an easy task: practical aspects, including application size and processing time, have to be taken into account. In this paper, we aim to develop a Thai speech synthesizer that can produce output speech in real time on a mobile device. Our synthesizer is based on Flite, an open-source synthesis library developed at Carnegie Mellon University. Flite is suitable for resource-limited devices as it is both small and fast. To use Flite as a text-to-speech engine for Thai, many components have to be modified. First, a word segmentation component and a Thai pronunciation dictionary are added to determine the word boundaries and the pronunciation of each word in Thai input text. To minimize resource usage, a simple word segmentation algorithm, longest matching, is employed. Next, to handle the tones in Thai, we integrate tones with phones and define a tonal phone set for Thai. Lastly, a small Thai speech database is essential. For this, we transform a unit selection database into a diphone database by selecting only the necessary diphones. We conducted an experiment to compare our speech synthesizer with pTalk, an HMM-based speech synthesizer, in terms of both speed and sound quality, the latter measured by a subjective listening test. While the quality of our output speech may not be as good as the output from pTalk, our system is much faster and more stable than pTalk.
  • Keywords
    audio databases; mobile computing; natural language processing; pattern matching; public domain software; real-time systems; software libraries; speech synthesis; Flite open source synthesis library; HMM-based speech synthesizer; Thai pronunciation dictionary; Thai speech database; application size; diphone database; mobile device; processing time; real-time Thai speech synthesizer; text-to-speech engine; tone handling; unit selection database; word segmentation component; Application software; Databases; Dictionaries; Engines; Libraries; Microcomputers; Mobile handsets; Speech synthesis; Synthesizers; Velocity measurement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing, 2009. SNLP '09. Eighth International Symposium on
  • Conference_Location
    Bangkok
  • Print_ISBN
    978-1-4244-4138-9
  • Electronic_ISBN
    978-1-4244-4139-6
  • Type

    conf

  • DOI
    10.1109/SNLP.2009.5340907
  • Filename
    5340907
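
The abstract names longest matching as the word segmentation algorithm. The record does not give the authors' implementation, so the following is only a minimal sketch of the general technique under stated assumptions: the function name `longest_matching`, the toy lexicon, and the fallback of emitting an unknown character as its own token are all illustrative choices, not details from the paper or from Flite.

```python
def longest_matching(text, lexicon, max_len=None):
    """Greedy left-to-right segmentation: at each position, take the
    longest lexicon entry that matches the text starting there."""
    if max_len is None:
        # Longest entry bounds how far ahead we need to look.
        max_len = max((len(w) for w in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for j in range(min(len(text), i + max_len), i, -1):
            if text[i:j] in lexicon:
                match = text[i:j]
                break
        if match is None:
            # Assumption: an out-of-lexicon character becomes its own token.
            match = text[i]
        words.append(match)
        i += len(match)
    return words


# Hypothetical toy lexicon; a real system would load a full dictionary.
toy_lexicon = {"ตา", "ตาก", "ลม", "กลม"}
segments = longest_matching("ตากลม", toy_lexicon)
print(segments)
```

With this toy lexicon the greedy choice picks "ตาก" before "ลม", even though "ตา" followed by "กลม" is also a valid split. That trade-off is typical of longest matching: it needs only a dictionary lookup per candidate, which suits a small device, but it cannot resolve such ambiguities on its own.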