DocumentCode
1909733
Title
Identification of Noun Phrase with Various Granularities
Author
Qin, Ying ; Wang, Xiaojie ; Zhong, Yixin
Author_Institution
Sch. of Inf. Eng., Beijing Univ. of Posts & Telecommun., Beijing
fYear
2007
fDate
Aug. 30 2007-Sept. 1 2007
Firstpage
197
Lastpage
202
Abstract
Since noun phrases are the most frequent phrases in texts, noun phrase identification is one of the vital subtasks of natural language processing. Chinese noun phrases generally have hierarchical inner structures. This paper proposes an approach that defines several levels of granularity for noun phrases to cater for different application demands. Three levels of granularity are proposed: concept noun phrase, base noun phrase and entire noun phrase. The task of noun phrase identification is to label word sequences with phrase tags. Identification at every level of granularity is cast as a classification problem under certain encoding schemes. The experimental dataset is acquired empirically from Chinese Penn Treebank 5.1. The F-measure of concept noun phrase, base noun phrase and entire noun phrase identification reaches 92.12%, 84.13% and 85.32% respectively.
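The abstract does not say which encoding schemes are used, so the following is only a minimal sketch of how phrase identification can be cast as per-word classification. It assumes the common BIO scheme (B-NP, I-NP, O) and an F-measure taken as the harmonic mean of span-level precision and recall. The function names `spans_to_bio`, `bio_to_spans` and `f1_score` are hypothetical and do not come from the paper.

```python
from typing import List, Tuple

# A phrase span is (start, end) in word indices, end exclusive.
Span = Tuple[int, int]


def spans_to_bio(n_words: int, spans: List[Span]) -> List[str]:
    """Encode non-overlapping noun phrase spans as one BIO tag per word.

    Assumption: BIO encoding; the paper's actual scheme is not stated.
    """
    tags = ["O"] * n_words
    for start, end in spans:
        tags[start] = "B-NP"              # first word of the phrase
        for i in range(start + 1, end):
            tags[i] = "I-NP"              # remaining words of the phrase
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode a BIO tag sequence back into noun phrase spans."""
    spans: List[Span] = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-NP":
            if start is not None:         # close the previous phrase
                spans.append((start, i))
            start = i
        elif tag == "I-NP":
            if start is None:             # tolerate I-NP without a B-NP
                start = i
        else:
            if start is not None:
                spans.append((start, i))
                start = None
    if start is not None:                 # phrase running to the end
        spans.append((start, len(tags)))
    return spans


def f1_score(gold: List[Span], predicted: List[Span]) -> float:
    """Span-level F1: a predicted phrase counts only on an exact match."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_set)
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, each granularity level (concept, base, entire noun phrase) would get its own gold spans over the same word sequence, and a classifier would predict one tag per word before decoding with `bio_to_spans`.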
Keywords
encoding; grammars; natural language processing; pattern classification; text analysis; Chinese noun phrases; classification problem; encoding schemes; granularity noun phrase identification; natural language processing; phrase tags; text phrases; word sequence labelling; Data mining; Information retrieval; Morphology; Natural language processing; Natural languages; Sun; Tin;
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2007. NLP-KE 2007. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-1610-3
Electronic_ISBN
978-1-4244-1611-0
Type
conf
DOI
10.1109/NLPKE.2007.4368033
Filename
4368033