  • DocumentCode
    566965
  • Title
    Based on support vector and word features new word discovery research
  • Author
    Chengcheng, Li; Yuanfang, Xu
  • Author_Institution
    Sch. of Comput. & Inf. Eng., Inner Mongolia Normal Univ., Hohhot, China
  • Volume
    1
  • fYear
    2012
  • fDate
    25-27 May 2012
  • Firstpage
    698
  • Lastpage
    701
  • Abstract
    Chinese word segmentation has difficulty handling ambiguity and recognizing unknown words. This paper extracts and quantifies new-word pattern features, together with various word-internal patterns, from the positive and negative samples of a training corpus, and then trains a support vector machine on them to obtain the support vectors used for new-word classification. On the test corpus, candidate new words are extracted and selected with the absolute discounting method and quantified using the word patterns extracted from the training corpus; the trained support vector machine then classifies them, and a set of filtering rules is applied to obtain the final new-word recognition results.
  • Keywords
    natural language processing; pattern classification; support vector machines; word processing; Chinese word segmentation; absolute discounting method; negative samples; positive samples; rule filter; support vector classification; support vector machine training; test corpus training; unknown word recognition; word discovery research; word internal patterns; word mode features; word pattern extraction; Classification algorithms; Computers; Educational institutions; Feature extraction; Statistical analysis; Support vector machines; Training; Natural language processing; support vector machine; word feature; word recognition;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
  • Conference_Location
    Zhangjiajie
  • Print_ISBN
    978-1-4673-0088-9
  • Type
    conf
  • DOI
    10.1109/CSAE.2012.6272688
  • Filename
    6272688