DocumentCode
2530686
Title
A Hybrid Abbreviation Extraction Technique for Biomedical Literature
Author
Song, Min ; Yoo, Illhoi
Author_Institution
New Jersey Inst. of Technol. Univ., Newark
fYear
2007
fDate
2-4 Nov. 2007
Firstpage
42
Lastpage
47
Abstract
In this paper, we propose a novel technique for extracting abbreviations from biomedical literature that combines natural language processing techniques with the Support Vector Machine (SVM). The proposed technique offers comparative advantages over others in the following respects: 1) It incorporates lexical analysis techniques into supervised learning for extracting abbreviations. 2) It makes use of text chunking techniques to identify the long forms of abbreviations. 3) It significantly improves Recall compared to other techniques. The experimental results show that our approach outperforms the leading abbreviation algorithms, Extract Abbrev, ALICE, and Acrophile, by at least 6%, 13.9%, and 13.2%, respectively, in both Precision and Recall on the Gold Standard Development corpus.
Keywords
information retrieval; medical administrative data processing; medical computing; natural language processing; support vector machines; text analysis; vocabulary; Gold Standard Development corpus; biomedical literature; hybrid abbreviation extraction technique; lexical analysis; natural language processing; supervised learning; support vector machine; text chunking; Abstracts; Bioinformatics; Biomembranes; Conference management; Data mining; Natural language processing; Proteins; Supervised learning; Support vector machines; Technology management;
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine, 2007. BIBM 2007. IEEE International Conference on
Conference_Location
Fremont, CA
Print_ISBN
978-0-7695-3031-4
Type
conf
DOI
10.1109/BIBM.2007.33
Filename
4413035