Title :
Quantitative measurement of clinic-genomic association for colorectal cancer using literature mining and Google-distance algorithm
Author :
Yaning Feng ; Ling Zheng ; Liying Song ; Huilong Duan ; Ning Deng
Author_Institution :
Key Lab. for Biomed. Eng., Zhejiang Univ., Hangzhou, China
Abstract :
A growing number of researchers are re-mining existing biomedical knowledge, focusing on how to establish associations between clinical and genomic data. However, quantitative analysis of such associations for a particular disease is still inadequate. Colorectal cancer is one of the malignant tumors whose molecular mechanism is relatively clear, making it an appropriate object of study. This paper proposes a quantitative measurement of clinic-genomic associations for colorectal cancer based on Google Distance, using the MEDLINE database as the corpus. The method combines several techniques, including mapping clinical and genomic data to MeSH terms and modifying the Normalized Google Distance using a year average. Data from Electronic Medical Records (EMR), Online Mendelian Inheritance in Man (OMIM), and the Genetic Association Database (GAD) were used in this study. A total of 3795 clinic-genomic associations of colorectal cancer between 67 clinical concepts and 236 genes were obtained, of which 584 involve genes contained in the colorectal cancer pathway according to KEGG pathway analysis. The associations were assessed and interpreted using KEGG and GeneCards, which yielded new discoveries. The method is valid for quantitative analysis using biomedical literature, performs well in measuring associations between clinical and genomic data, and can be transferred to research on other diseases.
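For orientation, the standard (unmodified) Normalized Google Distance between two terms can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, its parameters, and the example counts are hypothetical, and the year-average modification mentioned in the abstract is not reproduced because the abstract does not specify it. Here `fx` and `fy` stand for the number of corpus records (for example, MEDLINE citations) indexed with each term, `fxy` for the number indexed with both, and `n` for the corpus size.

```python
import math


def normalized_google_distance(fx, fy, fxy, n):
    """Standard NGD from hit counts (hypothetical helper, not the paper's code).

    fx, fy : number of records containing term x / term y
    fxy    : number of records containing both terms
    n      : total number of records in the corpus
    Smaller values indicate a stronger association.
    """
    if fx <= 0 or fy <= 0 or fxy <= 0:
        # No occurrence or no co-occurrence: treat the terms as unrelated.
        return math.inf
    if n <= min(fx, fy):
        raise ValueError("corpus size n must exceed the smaller term count")
    log_fx, log_fy, log_fxy = math.log(fx), math.log(fy), math.log(fxy)
    numerator = max(log_fx, log_fy) - log_fxy
    denominator = math.log(n) - min(log_fx, log_fy)
    return numerator / denominator


# Made-up counts for one clinical concept and one gene term.
distance = normalized_google_distance(fx=1000, fy=100, fxy=10, n=1_000_000)
print(distance)
```

Under this definition, two terms that always co-occur have distance 0, and terms that never co-occur are treated as infinitely distant.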
Keywords :
associative processing; bioinformatics; biological organs; cancer; data analysis; data mining; electronic health records; genetics; genomics; medical computing; text analysis; tumours; EMR data; GAD data; GeneCards; Genetic Association Database data; Google distance algorithm; KEGG pathway analysis; MEDLINE database; MeSH term; OMIM data; Online Mendelian Inheritance in Man data; biomedical knowledge discovery; biomedical literature mining; clinic data mapping; clinic-genomic association identification; clinic-genomic association measurement; clinical concept association; clinical data-genomic data association; colorectal cancer pathway; disease research; electronic medical record data; gene association; genomic data mapping; malignant tumor; molecular mechanism; normalized Google distance modification; quantitative analysis; quantitative measurement; year average; Bioinformatics; Cancer; Databases; Diseases; Genomics; Google; PubMed; association mining; colorectal cancer; google distance; literature mining;
Conference_Title :
Biomedical Engineering and Informatics (BMEI), 2014 7th International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-1-4799-5837-5
DOI :
10.1109/BMEI.2014.7002870