• DocumentCode
    3300723
  • Title
    Ambiguity solution of pinyin segmentation in continuous Pinyin-to-Character conversion
  • Author
    Wen, Juan ; Wang, Xiaojie ; Xu, Wenzhi ; Jiang, Huixing
  • Author_Institution
    Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., Beijing
  • fYear
    2008
  • fDate
    19-22 Oct. 2008
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Chinese Pinyin-to-character conversion is a key technology in Chinese Pinyin input systems. In sentence-based Pinyin-to-character conversion, the segmentation of the Pinyin string has an important influence on conversion performance. Pinyin string segmentation involves many ambiguities. This paper classifies them into overlap and combinational ambiguities and proposes a disambiguation algorithm for each class. We then combine ambiguity resolution with several different language models to implement the Pinyin-to-character conversion task. Experiments show that the proposed algorithms yield good performance.
  • Keywords
    natural language processing; Pinyin string segmentation; ambiguity resolution; continuous Pinyin-to-character conversion; disambiguation algorithms; Dynamic programming; Heuristic algorithms; Lattices; Natural languages; Search engines; Speech recognition; Speech synthesis; Statistical analysis; Statistics; Pinyin-to-Character Conversion; combinational ambiguities; overlap ambiguities
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
  • Conference_Location
    Beijing
  • Print_ISBN
    978-1-4244-4515-8
  • Electronic_ISBN
    978-1-4244-2780-2
  • Type
    conf
  • DOI
    10.1109/NLPKE.2008.4906775
  • Filename
    4906775