DocumentCode :
3407145
Title :
Uyghur morpheme-based language models and ASR
Author :
Ablimit, Mijit ; Neubig, Graham ; Mimura, Masato ; Mori, Shinsuke ; Kawahara, Tatsuya ; Hamdulla, Askar
Author_Institution :
Sch. of Inf., Kyoto Univ., Kyoto, Japan
fYear :
2010
fDate :
24-28 Oct. 2010
Firstpage :
581
Lastpage :
584
Abstract :
Uyghur is an agglutinative language in which words are formed by attaching suffixes to a stem (or root). Because the vocabulary of agglutinative languages grows explosively, several morpheme-based language models are built and experiments are conducted with them. A morpheme is the smallest meaning-bearing unit; in this research, the term refers to any prefix, stem, or suffix. A large-vocabulary ASR system is built on the basis of the Julius system, and ASR results for language models based on different units (word, morpheme, and syllable) are compared.
Keywords :
natural language processing; vocabulary; ASR; Julius system; Uyghur morpheme-based language model; agglutinative language; meaning bearing unit; Error analysis; Hidden Markov models; Joining processes; Speech recognition; Surface morphology; Training; Uyghur; language modeling ASR; morpheme segmenter
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing (ICSP), 2010 IEEE 10th International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-5897-4
Type :
conf
DOI :
10.1109/ICOSP.2010.5656065
Filename :
5656065