Title :
A Genomic Analysis Pipeline and Its Application to Pediatric Cancers
Author :
Zeller, Marcus ; Magnan, Christophe N. ; Patel, Vishal R. ; Rigor, Paul ; Sender, Leonard ; Baldi, Pierre
Author_Institution :
Dept. of Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
fDate :
Sept.-Oct. 2014
Abstract :
We present a cancer genomic analysis pipeline that takes as input sequencing reads for both germline and tumor genomes, from either the Complete Genomics or Illumina platforms, and outputs filtered lists of all genetic mutations, summarized as a short ranked list of the most affected genes in the tumor. A novel reporting and ranking system has been developed that makes use of publicly available datasets and literature specific to each patient, including new methods for using publicly available expression data in the absence of proper control data. Previously implicated small and large variations (including gene fusions) are reported in addition to probable driver mutations. Relationships between cancer and the sequenced tumor genome are highlighted using a network-based approach that integrates known and predicted protein-protein, protein-TF, and protein-drug interaction data. Through an integrative approach, the effects of genetic variations on gene expression are used to provide further evidence of driver mutations. The pipeline has been developed to assist in the analysis of pediatric tumors, as an unbiased and automated method for interpreting sequencing results and for identifying potentially therapeutic drugs and their targets. We present results that agree with previous literature and highlight specific findings in a few patients.
Keywords :
biology computing; cancer; genetics; genomics; hypermedia markup languages; molecular biophysics; paediatrics; proteins; tumours; Complete Genomics; Illumina platforms; cancer genomic analysis pipeline; driver mutations; gene expression; gene fusions; genetic mutations; genetic variations; genomic analysis pipeline; germline genomes; input sequencing; integrative approach; network-based approach; output filtered lists; pediatric cancers; pediatric tumors; probable driver mutations; protein-TF interaction data; protein-drug interaction data; protein-protein interaction data; publicly available datasets; publicly available expression data; ranking system; sequenced tumor genome; short-ranked list; therapeutic drugs; tumor genomes; Assembly; Bioinformatics; Cancer; Genomics; Proteins; Sequential analysis; Tumors; Pediatric cancer; genome analysis; next generation sequencing;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2014.2330616