DocumentCode :
2949905
Title :
BioWizard: Discovering and validating associations between biological entities by integrated analysis of scientific literature and experimental data
Author :
Spampinato, Concetto ; Giordano, Daniela ; Kavasidis, Isaak ; Milardo, Sebastiano
Author_Institution :
Dept. of Electr., Electron. & Comput. Eng., Univ. of Catania, Catania, Italy
fYear :
2012
fDate :
20-22 June 2012
Firstpage :
1
Lastpage :
6
Abstract :
In this paper, we present BioWizard, a bioinformatics knowledge discovery tool for extracting and validating implicit associations between biological entities. By mining specialized scientific literature, BioWizard not only generates biological hypotheses in the form of associations between genes, proteins and diseases, but also validates the plausibility of such associations against high-throughput biological data (microarrays) and annotated databases. The main novelties of the proposed approach are that: (1) it infers associations between biological entities by mining full-text papers rather than only abstracts, as existing tools usually do; (2) it uses a named entity recognition step that improves the precision of the derived associations by enriching the vocabularies used in the mining loop with terms extracted directly from the text; and (3) it filters the inferred associations according to their evidence in experimental data. We tested the precision and recall of our system in retrieving known associations (which did not appear in the same document) from gold standards, and the results show the ability of BioWizard to retrieve valid associations, thus providing biomedical researchers with a valuable tool to speed up scientific progress.
Keywords :
bioinformatics; data mining; diseases; genetics; information retrieval; molecular biophysics; proteins; BioWizard; annotated databases; bioinformatics knowledge discovery tool; biological entities; biological hypothesis generation; diseases; experimental data; genes; high-throughput biological data; integrated analysis; known-association retrieval; microarray data; named entity recognition; proteins; scientific literature; text mining; valid association retrieval; Databases; Dictionaries; Diseases; Protein engineering; Proteins; Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer-Based Medical Systems (CBMS), 2012 25th International Symposium on
Conference_Location :
Rome
ISSN :
1063-7125
Print_ISBN :
978-1-4673-2049-8
Type :
conf
DOI :
10.1109/CBMS.2012.6266327
Filename :
6266327