Title :
Network-based mutation analysis of putative cancer genes from next-generation sequencing data
Author :
Peilin Jia ; Zhongming Zhao
Author_Institution :
Sch. of Med., Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA
Abstract :
Next-generation sequencing (NGS) has enabled fast detection of somatic mutations in cancer genomes. A major challenge in interpreting the large volume of mutation data is to distinguish driver mutations from neutral passenger mutations. Current approaches primarily prioritize single genes according to their mutation frequencies, which yields high rates of both false positive and false negative discoveries. We propose a novel network-based method for prioritizing driver genes from large-scale cancer mutation data. Our method takes into account the mutation profile of each patient by fitting sample-specific generalized additive models. It builds on the joint frequency of mutated genes and their close interactors, which are optimized by the Random Walk with Restart algorithm in a protein-protein interaction network. We demonstrated our method on two large-scale NGS datasets: a lung adenocarcinoma (LUAD) dataset including 183 patients and a melanoma dataset including 121 samples. In each cancer, we derived a consensus mutation subnetwork significantly enriched for consensus cancer genes and cancer-related functional pathways. The LUAD subnetwork recruited 70 genes of the Cancer Gene Census (CGC) collection (p-value < 2.2 × 10^-16, Fisher's exact test) and the melanoma subnetwork included 65 CGC genes (p-value < 2.2 × 10^-16). In addition, our results indicate that some well-known, infrequently mutated genes, which have been ignored by conventional single-gene based approaches, are also prioritized and are shown to interact with highly recurrently mutated genes. In sum, our method is effective in prioritizing candidate driver genes from more than ten thousand mutated genes and provides biological interpretations for future work.
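The abstract names Random Walk with Restart (RWR) on a protein-protein interaction network but gives no details. The following is only a minimal sketch of the generic RWR iteration, p(t+1) = (1 - r) W p(t) + r p(0), with W the column-normalized adjacency matrix. The function name, the restart probability of 0.5, the toy four-node network, and the placeholder gene labels G1 to G4 are all assumptions made for illustration; none of them come from the paper, and the sketch does not reproduce the authors' sample-specific generalized additive models.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Generic RWR on an undirected graph (illustrative, not the paper's code).

    adjacency: dict mapping each node to the set of its neighbours.
    seeds:     dict mapping seed nodes to non-negative weights
               (for example, hypothetical mutation frequencies).
    Returns a dict of steady-state visiting probabilities.
    """
    nodes = sorted(adjacency)
    total = sum(seeds.values())
    # Restart distribution p(0): seed weights normalized to sum to 1.
    p0 = {n: seeds.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart term: r * p(0).
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk term: each node splits (1 - r) of its mass evenly among neighbours.
        for u in nodes:
            neighbours = adjacency[u]
            if not neighbours:
                # Isolated node: keep its walk mass in place so the total stays 1.
                nxt[u] += (1.0 - restart) * p[u]
                continue
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical toy network: a simple path G1 - G2 - G3 - G4.
toy_network = {
    "G1": {"G2"},
    "G2": {"G1", "G3"},
    "G3": {"G2", "G4"},
    "G4": {"G3"},
}
# Assume only G1 is mutated in this toy sample.
scores = random_walk_with_restart(toy_network, {"G1": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy case the scores decay with network distance from the seed, which illustrates how a walk of this kind can lend weight to close interactors of mutated genes even when those interactors are not mutated themselves.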
Keywords :
cancer; genetics; genomics; lung; molecular biophysics; proteins; random processes; CGC genes; LUAD dataset; LUAD subnetwork; biological interpretations; cancer gene census collection; cancer genomes; cancer-related functional pathways; consensus cancer genes; consensus mutation subnetwork; fitting sample-specific generalized additive models; gene prioritization; joint frequency; large-scale NGS datasets; lung adenocarcinoma dataset; melanoma dataset; mutation genes; network-based mutation analysis; neutral passenger mutations; next-generation sequencing data; primarily single-gene based prioritization; protein-protein interaction network; putative cancer genes; random walk algorithm; somatic mutation detection; Cancer; Next generation networking; Sequential analysis;
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2013 IEEE International Conference on
Conference_Location :
Shanghai
DOI :
10.1109/BIBM.2013.6732754