• DocumentCode
    538078
  • Title

    LEXiTRON-Pro Editor: An integrated tool for developing Thai pronunciation dictionary

  • Author

    Klaithin, Supon ; Chootrakool, Patcharika ; Kosawat, Krit

  • Author_Institution
    Human Language Technol. Lab. (HLT), Nat. Electron. & Comput. Technol. Center (NECTEC), Pathumthani, Thailand
  • fYear
    2010
  • fDate
    18-20 Oct. 2010
  • Firstpage
    429
  • Lastpage
    433
  • Abstract
    A pronunciation dictionary is a crucial part of both Text-To-Speech and Automatic Speech Recognition systems. In this paper, we propose a tool, called LEXiTRON-Pro Editor, for easily creating and editing a Thai pronunciation dictionary. The tool integrates Thai word segmentation, Thai Grapheme-to-Phoneme (G2P) conversion, and a database system with statistics. It automatically proposes a word's pronunciation to users using one of three options, tried in the following order: the pronunciation from the LEXiTRON-Pro database, the pronunciation combined from the syllables with the highest probability, and the pronunciation from Thai G2P. However, users can switch to another option or even directly input their own pronunciation through an easy-to-use editor interface. Our LEXiTRON-Pro database initially contains 105,129 unique words and 24,736 unique syllables with pronunciations. Compared to the previous version, our new program reduces the dictionary development process from five steps to one and the number of tools used by linguists from three to one. Moreover, our experiment shows that the time consumption and the number of ungenerable words are significantly reduced, while the pronunciation accuracy is considerably improved.
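The three-option fallback order described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the function name `propose_pronunciation`, the dictionary layouts, the placeholder Latin-script entries, and the rule of picking the most probable pronunciation for each syllable independently are all hypothetical, and `segment` and `g2p` are toy stand-ins for the Thai word segmentation and G2P components.

```python
from typing import Callable, Dict, List, Tuple

def propose_pronunciation(
    word: str,
    word_db: Dict[str, str],
    syllable_db: Dict[str, Dict[str, float]],
    segment: Callable[[str], List[str]],
    g2p: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (pronunciation, source) using the fallback order from the abstract."""
    # Option 1: the word already has a pronunciation in the database.
    if word in word_db:
        return word_db[word], "database"

    # Option 2: combine the most probable pronunciation of each syllable.
    # Assumption: each syllable is chosen independently by highest probability.
    syllables = segment(word)
    if syllables and all(s in syllable_db for s in syllables):
        parts = [max(syllable_db[s], key=syllable_db[s].get) for s in syllables]
        return " ".join(parts), "syllables"

    # Option 3: fall back to grapheme-to-phoneme conversion.
    return g2p(word), "g2p"


# Toy data (placeholders, not real Thai entries).
word_db = {"abcd": "A-B C-D"}
syllable_db = {
    "ab": {"AB1": 0.7, "AB2": 0.3},
    "cd": {"CD1": 0.4, "CD2": 0.6},
}

def segment(word: str) -> List[str]:
    # Toy segmenter: split into two-character chunks.
    return [word[i:i + 2] for i in range(0, len(word), 2)]

def g2p(word: str) -> str:
    # Toy G2P: spell out each character in upper case.
    return " ".join(word.upper())

in_db = propose_pronunciation("abcd", word_db, syllable_db, segment, g2p)
from_syllables = propose_pronunciation("cdab", word_db, syllable_db, segment, g2p)
from_g2p = propose_pronunciation("abxy", word_db, syllable_db, segment, g2p)
```

Returning the source label alongside the pronunciation mirrors the abstract's point that users can see which option was proposed and switch to another one or enter their own.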
  • Keywords
    speech recognition; speech synthesis; LEXiTRON Pro editor; Thai pronunciation dictionary; automatic speech recognition; grapheme to phoneme conversion; text to speech conversion; Barium; Computer science; Information technology; Iron
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Information Technology (IMCSIT), Proceedings of the 2010 International Multiconference on
  • Conference_Location
    Wisla
  • ISSN
    2157-5525
  • Print_ISBN
    978-1-4244-6432-6
  • Type

    conf

  • DOI
    10.1109/IMCSIT.2010.5679947
  • Filename
    5679947