DocumentCode :
3165315
Title :
A Cascaded Approach to Biomedical Named Entity Recognition Using a Unified Model
Author :
Chan, Shing-Kit ; Lam, Wai ; Yu, Xiaofeng
Author_Institution :
Chinese Univ. of Hong Kong, Hong Kong
fYear :
2007
fDate :
28-31 Oct. 2007
Firstpage :
93
Lastpage :
102
Abstract :
We propose a cascaded approach for extracting biomedical named entities from text documents using a unified model. Previous works often ignore the high computational cost incurred by a single-phase approach. We alleviate this problem by dividing the named entity extraction task into a segmentation task and a classification task, reducing the computational cost by an order of magnitude. A unified model, which we term "maximum-entropy margin-based" (MEMB), is used in both tasks. The MEMB model considers the error between a correct and an incorrect output during training and helps improve the performance of extracting sparse entity types that occur in biomedical literature. We report experimental evaluations on the GENIA corpus available from the BioNLP/NLPBA (2004) shared task, which demonstrate the state-of-the-art performance achieved by the proposed approach.
Keywords :
document handling; information retrieval; learning (artificial intelligence); medical computing; pattern classification; biomedical named entity recognition; classification task; entity extraction problem; maximum-entropy margin-based model; segmentation task; supervised learning; text documents; Biomedical engineering; Computational efficiency; Data engineering; Data mining; Databases; Error correction; RNA; Research and development management; Systems engineering and theory; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2007. ICDM 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
ISSN :
1550-4786
Print_ISBN :
978-0-7695-3018-5
Type :
conf
DOI :
10.1109/ICDM.2007.20
Filename :
4470233