• DocumentCode
    49098
  • Title
    A Genomic Analysis Pipeline and Its Application to Pediatric Cancers

  • Author
    Zeller, Marcus ; Magnan, Christophe N. ; Patel, Vishal R. ; Rigor, Paul ; Sender, Leonard ; Baldi, Pierre
  • Author_Institution
    Dept. of Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
  • Volume
    11
  • Issue
    5
  • fYear
    2014
  • fDate
    Sept.-Oct. 1 2014
  • Firstpage
    826
  • Lastpage
    839
  • Abstract
    We present a cancer genomic analysis pipeline that takes as input sequencing reads for both germline and tumor genomes, from either the Complete Genomics or Illumina platforms, and outputs filtered lists of all genetic mutations in the form of a short ranked list of the most affected genes in the tumor. A novel reporting and ranking system has been developed that makes use of publicly available datasets and literature specific to each patient, including new methods for using publicly available expression data in the absence of proper control data. Previously implicated small and large variations (including gene fusions) are reported in addition to probable driver mutations. Relationships between cancer and the sequenced tumor genome are highlighted using a network-based approach that integrates known and predicted protein-protein, protein-TF, and protein-drug interaction data. By using an integrative approach, effects of genetic variations on gene expression are used to provide further evidence of driver mutations. This pipeline has been developed to assist in the analysis of pediatric tumors, as an unbiased and automated method for interpreting sequencing results and identifying potentially therapeutic drugs and their targets. We present results that agree with previous literature and highlight specific findings in a few patients.
  • Keywords
    biology computing; cancer; genetics; genomics; hypermedia markup languages; molecular biophysics; paediatrics; proteins; tumours; Complete Genomics; Illumina platforms; cancer genomic analysis pipeline; driver mutations; gene expression; gene fusions; genetic mutations; genetic variations; genomic analysis pipeline; germline genomes; input sequencing; integrative approach; network-based approach; output filtered lists; pediatric cancers; pediatric tumors; probable driver mutations; protein-TF interaction data; protein-drug interaction data; protein-protein interaction data; publicly available datasets; publicly available expression data; ranking system; sequenced tumor genome; short-ranked list; therapeutic drugs; tumor genomes; Assembly; Bioinformatics; Cancer; Genomics; Proteins; Sequential analysis; Tumors; Pediatric cancer; genome analysis; next generation sequencing;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2014.2330616
  • Filename
    6832550