• DocumentCode
    588746
  • Title

    Analysis on Effect Range of Context in Chinese Word Segmentation Based Word-Position Tagging

  • Author

    Xijie Wang ; An Guo

  • Author_Institution
    Sch. of Comput. & Inf. Eng., Anyang Normal Univ., Anyang, China
  • fYear
    2012
  • fDate
    2-4 Nov. 2012
  • Firstpage
    552
  • Lastpage
    555
  • Abstract
    Chinese word segmentation (CWS) can be recast as a word-position tagging problem and solved with a conditional random field (CRF). This approach has greatly improved CWS performance and is now in wide use. When training a CRF on a corpus, the size of the feature window is the key to the training effect. To analyze the effective range of context, string sequence tagging segmentation experiments are performed on Bakeoff2005 with the CRF++0.53 toolkit. The results are: (1) the following context contributes more than the preceding context; (2) the feature window that influences segmentation performance is no larger than 5, and the proper size is four or five.
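    The word-position tagging and feature-window ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the four-tag B/M/E/S scheme, the function names `to_position_tags` and `window_features`, the padding symbol, and the sample sentence are all assumptions made for illustration, and the record does not state which tag set or feature templates the paper used.

```python
def to_position_tags(words):
    """Map a segmented sentence to one position tag per character.

    Assumed tag set: B (word begin), M (middle), E (end), S (single-character word).
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def window_features(chars, i, before, after, pad="_"):
    """Collect the characters from position i-before to i+after as unigram features.

    The window size is before + after + 1; positions outside the sentence get `pad`.
    """
    feats = {}
    for offset in range(-before, after + 1):
        j = i + offset
        feats[f"C[{offset}]"] = chars[j] if 0 <= j < len(chars) else pad
    return feats


# Hypothetical example sentence, already segmented into words.
words = ["我们", "喜欢", "自然语言", "处理"]
chars = [c for w in words for c in w]
tags = to_position_tags(words)

# An asymmetric window of size 4: one character of preceding context,
# two characters of following context.
feats = window_features(chars, 0, before=1, after=2)
```

    Varying `before` and `after` independently is one way to compare the contribution of preceding and following context, which is the kind of comparison the abstract reports.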
  • Keywords
    identification technology; natural language processing; random processes; word processing; Bakeoff2005; CRF++0.53 toolkit; CWS; Chinese word segmentation; conditional random field; effect range analysis; feature window; string sequence tagging segmentation; training effect; word-position tagging; Computers; Context; Educational institutions; Indexes; Performance analysis; Tagging; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Information Networking and Security (MINES), 2012 Fourth International Conference on
  • Conference_Location
    Nanjing
  • Print_ISBN
    978-1-4673-3093-0
  • Type

    conf

  • DOI
    10.1109/MINES.2012.76
  • Filename
    6405616