Title :
Grapheme-to-phoneme conversion methods for minority language conditions
Author :
Mengxue Cao ; Steve Renals ; P. Bell ; Aijun Li ; Qiang Fang
Author_Institution :
Phonetics Lab., Chinese Acad. of Social Sci., Beijing, China
Abstract :
This study investigates grapheme-to-phoneme conversion approaches for minority language conditions. Whereas isolated-word data is used for major languages, sentence-form data is defined as the proper form of training data for minority languages. The Joint-multigram Model and the Hidden Markov Model were examined. The “treat-sentence-as-word” training method and the forced-alignment process are proposed to extend the Joint-multigram Model and the Hidden Markov Model, respectively, to meet minority language conditions. Results obtained from sentence-form training data using the proposed methods are as good as those obtained from isolated-word training data using previously proposed methods. The Joint-multigram Model performs better on well-designed training data, while the Hidden Markov Model has greater error capacity and is better suited to minority language conditions.
Keywords :
hidden Markov models; natural language processing; speech processing; speech recognition; speech synthesis; word processing; error capacity; forced-alignment process; grapheme-to-phoneme conversion methods; hidden Markov model; joint-multigram model; minority language conditions; sentence-form training data; treat-sentence-as-word training method; Context modeling; Data models; Hidden Markov models; Speech; Speech recognition; Training; Training data; Grapheme-to-phoneme; HMM; Joint-multigram Model; forced-alignment; treat-sentence-as-word;
Conference_Titel :
Speech Database and Assessments (Oriental COCOSDA), 2012 International Conference on
Conference_Location :
Macau
Print_ISBN :
978-1-4673-2811-1
Electronic_ISBN :
978-1-4673-2812-8
DOI :
10.1109/ICSDA.2012.6422470