Title :
An integrative network-driven pipeline for the prioritization of Alzheimer's disease genes
Author :
Browne, Fiona ; Haiying Wang ; Huiru Zheng
Author_Institution :
Sch. of Comput. & Math., Univ. of Ulster, Newtownabbey, UK
Abstract :
Large-scale, high-throughput technologies and genome-wide studies have been pivotal in the identification of disease-gene candidates from patient cohorts. The output of these studies is often a large list of gene candidates. Therefore, there is a pressing need for computational tools that integrate heterogeneous data and prioritize disease-gene candidates for further experimental investigation. To address this need, we propose a computational pipeline for the prioritization of disease-gene candidates. Our pipeline integrates heterogeneous data, including gene expression, a protein-protein interaction network, and ontology-based similarity and betweenness measures. Furthermore, we incorporate tissue-specific gene expression data into the evaluation step of our approach. The pipeline was applied to prioritize Alzheimer's disease (AD) genes, generating a list of 31 prioritized genes. This approach correctly identified the key AD susceptibility genes INPP5D and PSEN1. Biological process enrichment analysis revealed that the prioritized genes are associated with processes modulated in AD pathogenesis, including regulation of neurogenesis and generation of neurons. KEGG pathway analysis identified significant hub involvement in the Neurotrophin signaling and Huntington disease pathways. Furthermore, our evaluation demonstrated a relatively high predictive performance (AUC: 0.73) when classifying AD and normal gene expression profiles from individuals using leave-one-out cross validation. This work provides a foundation for future investigation of heterogeneous data integration for disease-gene prioritization.
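The abstract names betweenness as one of the network measures used for prioritization but does not say how it is computed or combined with the other data sources. As a minimal sketch only, the following shows how genes in a protein-protein interaction network could be ranked by betweenness centrality using Brandes' algorithm. The network `toy_ppi` and the gene labels `G1` to `G5` are hypothetical placeholders, not data from the paper, and this is not presented as the authors' implementation.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm)
    for an unweighted, undirected graph given as an adjacency dict."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                       # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}      # -1 marks "not yet visited"
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        # Breadth-first search: count shortest paths from s.
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Back-propagate pair dependencies from the farthest nodes.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once per endpoint, so halve the totals.
    return {v: c / 2 for v, c in score.items()}


# Hypothetical toy network (placeholder labels, not genes from the paper).
toy_ppi = {
    "G1": ["G2"],
    "G2": ["G1", "G3"],
    "G3": ["G2", "G4", "G5"],
    "G4": ["G3"],
    "G5": ["G3"],
}

scores = betweenness(toy_ppi)
# Rank genes from most to least central; ties are broken by label.
ranked = sorted(scores, key=lambda g: (-scores[g], g))
print(ranked)
```

In this toy network `G3` ranks first because every shortest path between the two sides of the network passes through it. In a real pipeline such a score would be only one input, alongside expression and ontology-based similarity; how those inputs are weighted is not specified in the abstract.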
Keywords :
biochemistry; bioinformatics; biological tissues; classification; data analysis; data integration; data mining; diseases; genetics; genomics; medical computing; medical disorders; molecular biophysics; molecular configurations; neurophysiology; ontologies (artificial intelligence); pattern matching; pipeline processing; proteins; sorting; AD gene expression profile classification; AD gene prioritization; AD pathogenesis; AD susceptible gene identification; Alzheimer disease gene prioritization; Huntington disease pathway; INPP5D; KEGG pathway analysis; PSEN1; betweenness measure; biological process enrichment analysis; computational pipeline; computational tool; disease-gene candidate identification; disease-gene candidate prioritization; evaluation section; gene candidate list size; genome-wide study; heterogeneous data integration; high-throughput technology; hub involvement identification; integrative network-driven pipeline; large-scale technology; leave-one-out cross validation; neurogenesis regulation; neurotrophin signaling; normal gene expression profile classification; ontology-based similarity; patient cohort; predictive performance; prioritized gene generation; prioritized gene modulation; protein-protein interaction network; tissue-specific gene expression data; Bioinformatics; Diseases; Gene expression; Neurons; Pipelines; Proteins; Semantics;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
Conference_Location :
Belfast
DOI :
10.1109/BIBM.2014.6999189