DocumentCode :
254225
Title :
Biomedical text mining for concept identification from traditional medicine literature
Author :
Javed, Z. ; Afzal, H.
Author_Institution :
Dept. of Comput. Software Eng., Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
fYear :
2014
fDate :
18-20 Dec. 2014
Firstpage :
206
Lastpage :
211
Abstract :
In recent years, a vast amount of biomedical literature has been produced and published. Recent developments in biomedical text mining show potential for supporting scientists in understanding new information from the existing biomedical literature, because the volume of electronically available biomedical literature is increasing massively. Automated literature mining offers an opportunity to discover different entities from the literature. Web technologies allow these entities to be stored and published in a form suitable for further reuse by researchers. The approach presented here uses text mining methodologies to automatically extract different entities from biomedical text. For this purpose, biomedical articles on Traditional Chinese Medicine are extracted from BioMed Central and PubMed Central and used as the corpus. The text mining techniques of tokenization, splitting, stemming, lemmatization, parsing, and named entity recognition are used to preprocess the corpus. Candidate terms are identified by applying the C-Value algorithm. These candidate terms and the existing Seed/Ontological Terms are tagged in the corpus. By comparing the lexical and contextual profiles of the candidate terms with those of the existing Seed/Ontological Terms, we have identified new concepts. The identified concepts are evaluated.
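The abstract names the C-Value algorithm for candidate term identification but gives no details. The following is a minimal sketch of the commonly cited C-value formula, not the authors' implementation: a candidate's frequency is reduced by the mean frequency of the longer candidates that contain it, then weighted by the base-2 logarithm of its length in words. The function names `contains` and `c_value` and the toy frequencies are hypothetical, and the sketch assumes candidates are already extracted as word tuples with corpus counts.

```python
import math


def contains(longer, shorter):
    """Return True if `shorter` occurs as a contiguous word sequence in `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Score each candidate term (a tuple of words) given its corpus frequency.

    Assumed formula: log2(|a|) * f(a) for non-nested terms, and
    log2(|a|) * (f(a) - mean frequency of longer candidates containing a)
    for nested terms. Single-word terms score 0 under this form.
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of all longer candidates in which `a` is nested.
        nesting = [f_b for b, f_b in freq.items()
                   if len(b) > len(a) and contains(b, a)]
        if nesting:
            f_a = f_a - sum(nesting) / len(nesting)
        scores[a] = math.log2(len(a)) * f_a
    return scores


# Hypothetical toy counts, for illustration only.
candidates = {
    ("herbal", "medicine"): 10,
    ("traditional", "herbal", "medicine"): 4,
    ("chinese", "herbal", "medicine"): 2,
}
scores = c_value(candidates)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 2))
```

In this toy input, "herbal medicine" is nested in two longer candidates, so its frequency of 10 is discounted by their mean frequency of 3 before weighting. Some variants use log2(|a| + 1) so that single-word terms are not zeroed out; which form the paper uses is not stated in the abstract.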
Keywords :
Internet; data mining; medical information systems; ontologies (artificial intelligence); text analysis; C-value algorithm; Web technology; automated literature mining; biomedical articles; biomedical literature; biomedical text mining; candidate terms; concept identification; contextual profiles; corpus preprocessing; lemmatization; lexical profiles; named entity recognition; parsing; seed-ontological terms; splitting; stemming; tokenization; traditional Chinese medicine; Biomedical measurement; Blogs; Computers; Diseases; Media; Vocabulary; Writing; Disease Names Recognition; Named Entity Recognition; Ontological Terms; Seed Terms; Term Classification; Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Open Source Systems and Technologies (ICOSST), 2014 International Conference on
Conference_Location :
Lahore
Print_ISBN :
978-1-4799-2053-2
Type :
conf
DOI :
10.1109/ICOSST.2014.7029345
Filename :
7029345