Title :
Chinese Typed Collocation Extraction using Corpus-based Syntactic Collocation Patterns
Author :
Li, Wanyin ; Lu, Qin ; Liu, James
Author_Institution :
Hong Kong Polytech. Univ., Kowloon
fDate :
Aug. 30 2007-Sept. 1 2007
Abstract :
Collocations play a significant role in many applications, and extracting them automatically is useful in NLP. Syntactic-based phrase patterns used in collocation extraction bring two advantages: the results are well-formed, and the candidates are automatically classified into syntactically congeneric categories. However, due to the language independency, the arbitrary choice of syntactic patterns for target collocations brings drawbacks for evaluation as well as for adaptation to a new language. This work presents a corpus-driven framework that first generates collocation templates for noun and verb phrases and then integrates them with statistical association measures for noun/verb phrase collocation extraction, namely typed collocation extraction. The experimental results show a higher average precision of 84.80% and a so-called local recall of 55.99% on a set of randomly selected noun and verb headwords.
Keywords :
natural language processing; statistical analysis; Chinese typed collocation extraction; collocation templates; corpus-based syntactic collocation patterns; phrase collocation extraction; statistical association measures; syntactic-based phrase patterns; syntactically congeneric categories; target collocations; Data mining; Entropy; Explosions; Extraterrestrial phenomena; Frequency; Pattern analysis; Pattern matching; Statistical analysis; Tagging; Testing;
Conference_Title :
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-1611-0
Electronic_ISBN :
978-1-4244-1611-0
DOI :
10.1109/NLPKE.2007.4368039