Title :
Coreference resolution in biomedical full-text articles with domain dependent features
Author :
Huang, Cuili ; Wang, Yaqiang ; Zhang, Yongmei ; Jin, Yu ; Yu, Zhonghua
Author_Institution :
Coll. of Comput. Sci., Sichuan Univ., Chengdu, China
Abstract :
Coreference resolution is one of the most significant and difficult tasks in text mining. This paper proposes an algorithm for finding coreference relations in biomedical papers. We concentrate on noun phrases that refer to biomedical entities and work on biomedical full-text articles rather than abstracts. The algorithm uses not only linguistic features but also features derived from biomedical domain knowledge, in order to better capture the coreference relations between biomedical noun phrases. These features are fed to a probabilistic classifier that takes the dependence between features into account. Experiments show that the features are helpful and that the algorithm is effective for coreference resolution in the biomedical domain. On a full-text corpus annotated with anaphoric links, the proposed algorithm achieves 76.2% precision, 66.0% recall, and 70.7% F-measure. Owing to the appropriate features, this is a 13.8% improvement over the former algorithm.
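The three scores reported in the abstract can be cross-checked against each other. The sketch below assumes that "F-Measure" means the balanced F-measure (F1), i.e. the harmonic mean of precision and recall; the abstract does not state this explicitly, and the function and variable names are illustrative only.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-Measure" is the balanced variant.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Figures reported in the abstract, expressed as fractions.
reported_precision = 0.762
reported_recall = 0.660

f1 = f_measure(reported_precision, reported_recall)
print(f"F-measure: {f1:.1%}")  # prints "F-measure: 70.7%"
```

Under that assumption the computed value agrees with the reported 70.7%, so the three figures are mutually consistent.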
Keywords :
data mining; medical administrative data processing; pattern classification; probability; text analysis; biomedical domain knowledge; biomedical full-text article; biomedical noun phrase; coreference resolution; domain dependent feature; probabilistic based classifier; text mining; Biology; Biomedical measurements; Equations; Semantics; coreference resolution; domain dependent features; full-text articles; gene name;
Conference_Title :
Computer Technology and Development (ICCTD), 2010 2nd International Conference on
Conference_Location :
Cairo
Print_ISBN :
978-1-4244-8844-5
Electronic_ISBN :
978-1-4244-8845-2
DOI :
10.1109/ICCTD.2010.5645973