DocumentCode :
3300723
Title :
Ambiguity solution of pinyin segmentation in continuous Pinyin-to-Character conversion
Author :
Wen, Juan ; Wang, Xiaojie ; Xu, Wenzhi ; Jiang, Huixing
Author_Institution :
Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., Beijing
fYear :
2008
fDate :
19-22 Oct. 2008
Firstpage :
1
Lastpage :
7
Abstract :
Chinese Pinyin-to-character conversion is a key technology in Chinese Pinyin input systems. In sentence-based Pinyin-to-character conversion, the segmentation of the Pinyin string has an important influence on conversion performance. Pinyin string segmentation involves many ambiguities. This paper classifies them into overlap and combinational ambiguities and proposes a disambiguation algorithm for each. We then combine ambiguity resolution with several different language models to implement the Pinyin-to-character conversion task; experiments show that the proposed algorithms perform well.
Keywords :
natural language processing; Pinyin string segmentation; ambiguity resolution; continuous Pinyin-to-character conversion; disambiguation algorithms; Dynamic programming; Heuristic algorithms; Lattices; Natural languages; Search engines; Speech recognition; Speech synthesis; Statistical analysis; Statistics; Pinyin-to-Character Conversion; combinational ambiguities; overlap ambiguities
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-4515-8
Electronic_ISBN :
978-1-4244-2780-2
Type :
conf
DOI :
10.1109/NLPKE.2008.4906775
Filename :
4906775