DocumentCode
3300723
Title
Ambiguity solution of pinyin segmentation in continuous Pinyin-to-Character conversion
Author
Wen, Juan ; Wang, Xiaojie ; Xu, Wenzhi ; Jiang, Huixing
Author_Institution
Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., Beijing
fYear
2008
fDate
19-22 Oct. 2008
Firstpage
1
Lastpage
7
Abstract
Chinese Pinyin-to-character conversion is a key technology in Chinese Pinyin input systems. In sentence-based Pinyin-to-character conversion, the segmentation of the Pinyin string strongly influences conversion performance. Pinyin string segmentation contains many ambiguities. This paper classifies them into overlap and combinational ambiguities and proposes a disambiguation algorithm for each class. We then combine ambiguity resolution with several different language models to implement the Pinyin-to-character conversion task. Experiments show that the proposed algorithms yield good performance.
Keywords
natural language processing; Pinyin string segmentation; ambiguity resolution; continuous Pinyin-to-character conversion; disambiguation algorithms; Dynamic programming; Heuristic algorithms; Lattices; Natural languages; Search engines; Speech recognition; Speech synthesis; Statistical analysis; Statistics; Pinyin-to-Character Conversion; combinational ambiguities; overlap ambiguities
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-4515-8
Electronic_ISBN
978-1-4244-2780-2
Type
conf
DOI
10.1109/NLPKE.2008.4906775
Filename
4906775