• DocumentCode
    2709379
  • Title
    Learning on Weighted Hypergraphs to Integrate Protein Interactions and Gene Expressions for Cancer Outcome Prediction
  • Author
    TaeHyun Hwang ; Ze Tian ; Rui Kuang ; Kocher, J.-P.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Univ. of Minnesota Twin Cities, Minneapolis, MN
  • fYear
    2008
  • fDate
    15-19 Dec. 2008
  • Firstpage
    293
  • Lastpage
    302
  • Abstract
    Building reliable predictive models from multiple complementary genomic data sources for cancer study is a crucial step towards successful cancer treatment and a full understanding of the underlying biological principles. To tackle this challenging data integration problem, we propose a hypergraph-based learning algorithm called HyperGene to integrate microarray gene expressions and protein-protein interactions for cancer outcome prediction and biomarker identification. HyperGene is a robust two-step iterative method that alternately finds the optimal outcome prediction and the optimal weighting of the marker genes guided by a protein-protein interaction network. Under the hypothesis that cancer-related genes tend to interact with each other, the HyperGene algorithm uses a protein-protein interaction network as prior knowledge by imposing a consistent weighting of interacting genes. Our experimental results on two large-scale breast cancer gene expression datasets show that HyperGene utilizing a curated protein-protein interaction network achieves significantly improved cancer outcome prediction. Moreover, HyperGene can also retrieve many known cancer genes as highly weighted marker genes.
  • Keywords
    learning (artificial intelligence); medical computing; HyperGene; cancer outcome prediction; challenging data integration problem; gene expressions; hypergraph-based learning algorithm; multiple complementary genomic data; protein interactions; protein-protein interactions; reliable predictive models; weighted hypergraphs; Bioinformatics; Biomarkers; Cancer; Gene expression; Genomics; Iterative algorithms; Iterative methods; Predictive models; Proteins; Robustness; biomarker identification; cancer genomics; semi-supervised learning; spectral graph learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
  • Conference_Location
    Pisa
  • ISSN
    1550-4786
  • Print_ISBN
    978-0-7695-3502-9
  • Type
    conf
  • DOI
    10.1109/ICDM.2008.37
  • Filename
    4781124