DocumentCode
566965
Title
Based on support vector and word features new word discovery research
Author
Chengcheng, Li ; Yuanfang, Xu
Author_Institution
Sch. of Comput. & Inf. Eng., Inner Mongolia Normal Univ., Hohhot, China
Volume
1
fYear
2012
fDate
25-27 May 2012
Firstpage
698
Lastpage
701
Abstract
Chinese word segmentation has difficulty handling ambiguity and recognizing unknown words. This paper proposes extracting and quantifying new-word pattern features, together with various word-internal patterns, from the positive and negative samples of a training corpus, and then training a support vector machine on them to obtain the support vectors used for new-word classification. On the test corpus, new-word candidates are extracted and selected with the absolute discounting method and quantified using the word patterns extracted from the training corpus. The candidates are then classified by the trained support vector machine, and a set of filtering rules is applied to obtain the final word recognition results.
Keywords
natural language processing; pattern classification; support vector machines; word processing; Chinese word segmentation; absolute discounting method; negative samples; positive samples; rule filter; support vector classification; support vector machine training; test corpus training; unknown word recognition; word discovery research; word internal patterns; word mode features; word pattern extraction; Classification algorithms; Computers; Educational institutions; Feature extraction; Statistical analysis; Support vector machines; Training; Natural language processing; support vector machine; word feature; word recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Automation Engineering (CSAE), 2012 IEEE International Conference on
Conference_Location
Zhangjiajie
Print_ISBN
978-1-4673-0088-9
Type
conf
DOI
10.1109/CSAE.2012.6272688
Filename
6272688