Title :
Study on incorporating tone into speech recognition of Vietnamese
Author :
Thien Chuong Nguyen ; Chaloupka, Josef ; Nouza, Jan
Author_Institution :
Inst. of Inf. Technol. & Electron., Tech. Univ. of Liberec, Liberec, Czech Republic
Abstract :
Vietnamese is a syllable-based tonal language in which the tone used in pronouncing a syllable carries important information about its meaning. In this paper, we investigate several approaches to incorporating tone into an acoustic model. We propose three basic strategies: a) phoneme-based, b) vowel-based, and c) rhyme-based. Each can be modified, so that we obtain 15 different schemes, which are described and compared in experiments performed within the framework of large-vocabulary continuous speech recognition of Vietnamese. We show that the phoneme-based context-dependent model performs best, particularly when information about the tone is linked to the syllable end. On the test set, made up of 85 minutes of mostly broadcast speech, we achieve a syllable accuracy of 74%. The accuracy improves further to 78% when the pronunciation lexicon and the language model also take into account the 40,000 most frequent syllable pairs.
Keywords :
natural language processing; speech recognition; Vietnamese large-vocabulary continuous speech recognition; acoustic model; broadcast speech; phoneme-based context dependent model; phoneme-based strategy; rhyme-based strategy; syllable pronunciation lexicon; syllable-based tonal language; vowel-based strategy; Accuracy; Acoustics; Dictionaries; Gold; Hidden Markov models; Speech; Speech recognition; language model; speech recognition of Vietnamese; syllable modeling; tonal language;
Conference_Title :
Electronics, Control, Measurement, Signals and their Application to Mechatronics (ECMSM), 2015 IEEE International Workshop of
Conference_Location :
Liberec
Print_ISBN :
978-1-4799-6970-8
DOI :
10.1109/ECMSM.2015.7208688