• DocumentCode
    124189
  • Title

    Integrating Pinyin to Improve Spelling Errors Detection for Chinese Language

  • Author

    Peng Jin ; Xingyuan Chen ; Zhaoyi Guo ; Pengyuan Liu

  • Author_Institution
    Lab. of Intell. Inf. Process., Leshan Normal Univ., Leshan, China
  • Volume
    1
  • fYear
    2014
  • fDate
    11-14 Aug. 2014
  • Firstpage
    455
  • Lastpage
    458
  • Abstract
    Most Chinese text is typed on a keyboard using one of two input methods, Pinyin or Wubi, and Pinyin is the more widely used. In this paper, this user habit is exploited to detect spelling errors automatically. We first train a Chinese character-form n-gram language model on a large-scale Chinese corpus in the traditional way. To improve this character-based model, we convert the whole corpus into Pinyin to obtain a Pinyin-based language model. Furthermore, tone is taken into account to obtain a third model. By integrating these three models, we improve the performance of the spelling error detection system. Experimental results demonstrate the effectiveness of our model.
  • Keywords
    computational linguistics; error detection; natural language processing; spelling aids; Chinese character form n-gram language model; Chinese language; Chinese texts; Pinyin based language model; Pinyin input method; Wubi input method; spelling error detection; Computational modeling; Conferences; Educational institutions; Information processing; Integrated circuit modeling; Keyboards; Pinyin; n-gram language model; spelling error;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
  • Conference_Location
    Warsaw
  • Type

    conf

  • DOI
    10.1109/WI-IAT.2014.71
  • Filename
    6927580
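
The abstract describes three n-gram models (characters, Pinyin without tone, Pinyin with tone) that are integrated for error detection, but this record does not say how they are combined. The following is only a minimal sketch under stated assumptions, not the authors' method: it uses add-one smoothed bigram models, a weighted sum of log-probabilities as the integration step, and a tiny hand-written character-to-Pinyin table. The names `TONED`, `BigramLM`, `score`, and `most_suspicious` are hypothetical and introduced here for illustration.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (assumption): character -> Pinyin with a tone digit.
TONED = {"我": "wo3", "在": "zai4", "再": "zai4", "家": "jia1", "见": "jian4"}

def to_toned(chars):
    """Map characters to Pinyin syllables that keep the tone digit."""
    return [TONED[c] for c in chars]

def to_plain(chars):
    """Map characters to Pinyin syllables with the tone digit stripped."""
    return [TONED[c].rstrip("0123456789") for c in chars]

class BigramLM:
    """Add-one smoothed bigram model over arbitrary token sequences."""

    def __init__(self, sentences):
        self.contexts = Counter()  # how often each token appears as a history
        self.bigrams = Counter()   # how often each (history, token) pair occurs
        vocab = set()
        for s in sentences:
            toks = ["<s>"] + list(s)
            vocab.update(toks[1:])
            self.contexts.update(toks[:-1])
            self.bigrams.update(zip(toks, toks[1:]))
        self.vocab_size = len(vocab)

    def logprob(self, prev, tok):
        num = self.bigrams[(prev, tok)] + 1
        den = self.contexts[prev] + self.vocab_size
        return math.log(num / den)

    def position_logprobs(self, tokens):
        """Log-probability of each token given the token before it."""
        toks = ["<s>"] + list(tokens)
        return [self.logprob(p, t) for p, t in zip(toks, toks[1:])]

def train(corpus):
    """Train the character, toneless-Pinyin, and toned-Pinyin models."""
    return (
        BigramLM(corpus),
        BigramLM([to_plain(s) for s in corpus]),
        BigramLM([to_toned(s) for s in corpus]),
    )

def score(sentence, models, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three models' sentence log-probabilities.

    The linear combination and the weights are assumptions for this sketch.
    """
    views = (list(sentence), to_plain(sentence), to_toned(sentence))
    return sum(
        w * sum(lm.position_logprobs(view))
        for lm, view, w in zip(models, views, weights)
    )

def most_suspicious(sentence, models):
    """Index where the sound sequence is familiar but the character is not.

    Heuristic (assumption): largest gap between the averaged Pinyin
    log-probability and the character log-probability.
    """
    char_lm, plain_lm, toned_lm = models
    char_lp = char_lm.position_logprobs(sentence)
    plain_lp = plain_lm.position_logprobs(to_plain(sentence))
    toned_lp = toned_lm.position_logprobs(to_toned(sentence))
    gaps = [0.5 * (p + t) - c for c, p, t in zip(char_lp, plain_lp, toned_lp)]
    return max(range(len(gaps)), key=gaps.__getitem__)

# Toy usage: "我再家" replaces 在 with its homophone 再.
models = train(["我在家", "再见"])
correct, typo = "我在家", "我再家"
suspect_index = most_suspicious(typo, models)
```

In this toy setup a homophone substitution leaves both Pinyin views unchanged, so only the character model penalizes the typo, and the gap heuristic points at the substituted position. A real system would need a full pronunciation lexicon, handling of polyphonic characters, and a large training corpus.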