Title :
Korean pronunciation variation modeling with probabilistic Bayesian networks
Author :
Sakti, Sakriani ; Finch, Andrew ; Isotani, Ryosuke ; Kawai, Hisashi ; Nakamura, Satoshi
Author_Institution :
MASTAR Project, Nat. Inst. of Inf. & Commun. Technol. (NICT), Japan
Abstract :
In the Korean language, a large proportion of word units are pronounced differently from their written forms because the language is agglutinative and highly inflective, with severe phonological phenomena and coarticulation effects. This paper reports on an ongoing study of Korean pronunciation modeling, in which the mapping between phonemic and orthographic units is modeled by a Bayesian network (BN). The advantage of this graphical model framework is that the probabilistic relationships between these symbols, as well as additional knowledge sources, can be learned in a general and flexible way, so knowledge sources from different domains can be incorporated easily. In this preliminary study, we start with a simple topology in which the additional knowledge includes only the preceding and succeeding contexts of the current phonemic unit. In practice, the proposed BN pronunciation model is applied to our syllable-based Korean large-vocabulary continuous speech recognition (LVCSR) system, where we structure the speech recognition task as a serial architecture composed of two independent parts. The first part performs standard hidden Markov model (HMM)-based recognition of phonemic syllable units of the actual pronunciation (surface forms). In this way, the lexicon and the out-of-vocabulary rate can be kept small while avoiding high acoustic confusability. The second part then transforms the phonemic syllable surface forms into the desired Korean orthographic units (eumjeol) using the proposed BN pronunciation model. Experimental results show that the proposed BN model maps the phonemic syllable surface forms to eumjeol transcriptions with more than 97% accuracy on average. The results also show that the model enhances our Korean LVCSR system, giving about 25.53% absolute improvement on average over baseline orthographic syllable recognition.
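The second stage described in the abstract (surface syllable plus its preceding and succeeding context, mapped to an orthographic syllable) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the BN topology, training procedure, or data. The sketch assumes a single conditional probability table P(orthographic | previous, current, next) estimated by relative frequency from one-to-one aligned syllable pairs, with a backoff to P(orthographic | current) for unseen contexts. The names `train`, `posterior`, `decode`, `BOUNDARY`, and `TOY_PAIRS`, the backoff rule, and the three toy words are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

BOUNDARY = "#"  # hypothetical padding symbol for word edges

# Toy aligned data (not from the paper): (surface syllables, orthographic syllables).
TOY_PAIRS = [
    (["궁", "물"], ["국", "물"]),  # 국물 is pronounced [궁물] (nasalization)
    (["궁", "전"], ["궁", "전"]),  # 궁전 is pronounced as written
    (["가", "치"], ["같", "이"]),  # 같이 is pronounced [가치] (palatalization)
]


def train(pairs):
    """Count orthographic units per (prev, cur, next) context and per cur alone."""
    ctx_counts = defaultdict(Counter)  # (prev, cur, next) -> orthographic counts
    uni_counts = defaultdict(Counter)  # cur -> orthographic counts (backoff table)
    for surface, ortho in pairs:
        padded = [BOUNDARY] + list(surface) + [BOUNDARY]
        for i, target in enumerate(ortho):
            prev, cur, nxt = padded[i], padded[i + 1], padded[i + 2]
            ctx_counts[(prev, cur, nxt)][target] += 1
            uni_counts[cur][target] += 1
    return ctx_counts, uni_counts


def posterior(model, prev, cur, nxt):
    """Relative-frequency estimate of P(orthographic | prev, cur, next).

    Backs off to P(orthographic | cur) for an unseen context, and to the
    identity mapping for an unseen surface unit (both are assumptions).
    """
    ctx_counts, uni_counts = model
    counts = ctx_counts.get((prev, cur, nxt)) or uni_counts.get(cur)
    if not counts:
        return {cur: 1.0}
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()}


def decode(model, surface):
    """Pick the most probable orthographic unit for each surface syllable."""
    padded = [BOUNDARY] + list(surface) + [BOUNDARY]
    result = []
    for i in range(len(surface)):
        dist = posterior(model, padded[i], padded[i + 1], padded[i + 2])
        result.append(max(dist, key=dist.get))
    return result


model = train(TOY_PAIRS)
print(decode(model, ["궁", "물"]))  # ['국', '물']
print(decode(model, ["궁", "전"]))  # ['궁', '전']
```

The two decoded words show why the succeeding context matters in this toy setting: the same surface syllable 궁 maps to 국 before 물 but stays 궁 before 전. A full Bayesian network could add further parent nodes for other knowledge sources, which this table-based sketch does not attempt.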
Keywords :
belief networks; hidden Markov models; probability; speech recognition; BN; HMM; Korean language; Korean orthography; Korean pronunciation modeling; Korean pronunciation variation modeling; LVCSR; acoustic confusability; coarticulation effects; graphical model framework; hidden Markov model; inflective nature; large-vocabulary continuous speech recognition; orthographic syllable recognition; orthographic units; out-of-vocabulary rates; phonemic units; phonological phenomena; probabilistic Bayesian networks; Accuracy; Acoustics; Computational modeling; Hidden Markov models; Joints; Speech recognition; Training
Conference_Title :
Universal Communication Symposium (IUCS), 2010 4th International
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-7821-7
DOI :
10.1109/IUCS.2010.5666770