• DocumentCode
    2895158
  • Title

    Automatic online text selection for constructing text corpus with custom phonetic distribution

  • Author

    Vorapatratorn, Surapol ; Suchato, Atiwong ; Punyabukkana, Proadpran

  • Author_Institution
    Dept. of Comput. Eng., Chulalongkorn Univ., Bangkok, Thailand
  • fYear
    2012
  • fDate
    May 30 2012-June 1 2012
  • Firstpage
    6
  • Lastpage
    11
  • Abstract
    The performance of Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) systems depends on an appropriate text corpus. This article describes an automated method for generating a text corpus with a custom phonetic distribution. The distribution is defined by the phoneme types, the corpus size, the minimum criterion number of phonemes, and the target phonetic distribution. The system collects text from the Internet by continuously downloading it with a web crawler. A greedy algorithm then extracts the sentences that best fit the target phonetic distribution until an appropriate text corpus is established. In the experiment, text from the Large Vocabulary Continuous Speech Recognition (LVCSR) corpus for the Thai language [1] is used to generate the target phonetic distribution. The results show that, as more data is drawn from the Internet, the selected corpus reaches the target phonetic distribution and achieves 99.13% diphone coverage. The resulting text corpus can then be used to build a speech corpus efficiently.
  • Keywords
    Internet; greedy algorithms; information retrieval; natural languages; speech recognition; text analysis; ASR; Internet; LVCSR corpus; TTS; Thai language; Web crawler; automated text corpus generation method; automatic online text selection; automatic speech recognition; corpus size; custom phonetic distribution; data downloading; diphone coverage generation; greedy algorithm; large-vocabulary continuous speech recognition corpus; phoneme minimum criterion number; phoneme types; proper sentence extraction; speech corpus generation; target phonetic distribution; text-to-speech systems; Databases; Equations; Greedy algorithms; Internet; Mathematical model; Speech; Vocabulary; greedy algorithm; online corpus; phonetic; phonetically balanced; sentence segmentation; text selection;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Software Engineering (JCSSE), 2012 International Joint Conference on
  • Conference_Location
    Bangkok
  • Print_ISBN
    978-1-4673-1920-1
  • Type

    conf

  • DOI
    10.1109/JCSSE.2012.6261916
  • Filename
    6261916
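
The greedy sentence selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the selection criterion, so the L1 distance between distributions used here is an assumption, as are the names `phonemes`, `distance`, and `greedy_select`. The `phonemes` function is a hypothetical stand-in that treats each letter as one phoneme; a real system would use a grapheme-to-phoneme converter for Thai. The minimum criterion number of phonemes and the web crawler are not modeled.

```python
from collections import Counter


def phonemes(sentence):
    # Hypothetical stand-in for a grapheme-to-phoneme step:
    # each alphabetic character is treated as one "phoneme".
    return [ch for ch in sentence.lower() if ch.isalpha()]


def distance(counts, target):
    # Assumed criterion: L1 distance between the normalized phoneme
    # counts and the target distribution (probabilities summing to 1).
    total = sum(counts.values())
    if total == 0:
        return sum(target.values())
    keys = set(counts) | set(target)
    return sum(abs(counts.get(k, 0) / total - target.get(k, 0.0)) for k in keys)


def greedy_select(candidates, target, corpus_size):
    # Repeatedly add the candidate sentence that brings the running
    # phoneme distribution closest to the target, until the requested
    # number of sentences is reached or the pool is empty.
    selected = []
    counts = Counter()
    pool = list(candidates)
    while pool and len(selected) < corpus_size:
        best = min(pool, key=lambda s: distance(counts + Counter(phonemes(s)), target))
        pool.remove(best)
        selected.append(best)
        counts += Counter(phonemes(best))
    return selected, counts


# Toy usage: a two-phoneme target and four candidate "sentences".
target = {"a": 0.5, "b": 0.5}
selected, counts = greedy_select(["aa", "ab", "bb", "aaaa"], target, corpus_size=2)
print(selected, distance(counts, target))
```

Stopping at a fixed number of sentences is a simplification. The abstract describes continuing until an appropriate corpus is established, which would correspond to looping until the distance falls below a chosen threshold while new crawled text keeps refilling the pool.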