Title :
Discovering Gene Expression Data from the Tables of Full Text Publications
Author :
Mathiak, Brigitte ; Kupfer, Andreas ; Bartulos, Carolina Rio ; Scope, Tatjana ; Weiland, Johann ; Eckstein, Silke
Author_Institution :
Inst. für Informationssysteme, Braunschweig
Abstract :
Finding out which genes are expressed under which circumstances is one of the most common tasks in text mining for bioinformatics. Usually, however, the derived data comes from the abstract or other descriptive text in the literature. In the age of modern high-throughput microarray analysis, there is too much data to be described textually; instead, this data often comes in the form of tables. In this paper, we look specifically at the tables, an approach that, to our knowledge, has not been described before. The goal is to attach gene names found in tables to their context for convenient literature review. To do so, matching literature has to be downloaded and pre-processed. Once that is done, gene names or protein names can be found through a fast and reliable search that presents all the associated literature at a glance.
Keywords :
biology computing; data mining; genetics; proteins; text analysis; bioinformatics; full text publications; gene expression data discovery; gene names; microarray analysis; protein names; tabular data; text mining; textual data; Bioinformatics; Biomedical optical imaging; Data mining; Diabetes; Gene expression; Natural languages; Optical character recognition software; Particle separators; Proteins; Text mining;
Conference_Title :
Data Mining Workshops, 2007. ICDM Workshops 2007. Seventh IEEE International Conference on
Conference_Location :
Omaha, NE
Print_ISBN :
978-0-7695-3019-2
Electronic_ISBN :
978-0-7695-3033-8
DOI :
10.1109/ICDMW.2007.29