Title :
Automatic identifying of maximal length noun phrase
Author :
Yegang Li ; Heyan Huang
Author_Institution :
Sch. of Comput. Sci. & Technol., Beijing Institute of Technol., Beijing, China
fDate :
Oct. 30 2012-Nov. 1 2012
Abstract :
Automatic recognition of maximal-length noun phrases (MNPs) aids shallow parsing. In this paper, automatic labeling of Chinese MNPs is treated as a sequential labeling task, and a Support Vector Machine (SVM) model is employed. We propose a 2-phase hybrid approach that first identifies base chunks and then identifies MNPs, so that base chunk features can be exploited to improve MNP recognition. In addition, both left-to-right and right-to-left sequential labeling are applied, and their outputs are combined by bidirectional sequence labeling merging to identify Chinese MNPs. The data set used in the experiments is selected from the Penn Chinese Treebank 5.0 corpus and split into training, development, and test sets in the proportion 4:4:1. Experimental results show an F1-measure of 90.13%.
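As a rough illustration of how a sequential labeling of MNPs can be scored with an F1-measure, the sketch below decodes tag sequences into phrase spans and compares them by exact match. The abstract does not state the tag set or the scoring procedure, so the B/I/O scheme, the exact-match criterion, and the function names `bio_to_spans` and `span_f1` are all assumptions made for this example, not the authors' method.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end exclusive


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect phrase spans from a B/I/O tag sequence (assumed tag set)."""
    spans: Set[Span] = set()
    start = None
    for i, tag in enumerate(tags):
        if tag == "I" and start is not None:
            continue  # still inside the currently open phrase
        if start is not None:
            spans.add((start, i))  # a B or O tag closes the open phrase
            start = None
        if tag != "O":
            start = i  # B opens a phrase; a stray I is treated as B
    if start is not None:
        spans.add((start, len(tags)))  # phrase running to the end
    return spans


def span_f1(gold: List[str], pred: List[str]) -> float:
    """Exact-match span F1 between gold and predicted tag sequences."""
    gold_spans = bio_to_spans(gold)
    pred_spans = bio_to_spans(pred)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Toy example: one of two gold phrases is recovered exactly.
gold = ["B", "I", "I", "O", "B", "I"]
pred = ["B", "I", "I", "O", "O", "B"]
print(span_f1(gold, pred))  # prints 0.5
```

In the toy example, precision and recall are both 1/2, which gives an F1 of 0.5; the figure reported in the abstract is computed on the authors' test set and cannot be reproduced from this sketch.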
Keywords :
grammars; natural language processing; support vector machines; 2-phase hybrid approach; F1-measure; MNP recognition; Penn Chinese treebank 5.0 corpus; SVM; automatic maximal length noun phrase identification; base chunk; bidirectional sequence labeling merging; left-right sequential labeling; right-left sequential labeling; sequential labeling task; shallow parsing; support vector machine model; Cloud computing; Labeling; Magnetic heads; Merging; Support vector machines; Syntactics; Tagging; 2-phase; MNP; base chunk feature
Conference_Title :
Cloud Computing and Intelligent Systems (CCIS), 2012 IEEE 2nd International Conference on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4673-1855-6
DOI :
10.1109/CCIS.2012.6664624