  • DocumentCode
    1136746
  • Title
    Data-driven approach to predict survival of cancer patients
  • Author
    Motakis, Efthimios ; Ivshina, Anna V. ; Kuznetsov, Vladimir A.
  • Author_Institution
    Biopolis, Bioinformatics Inst., Singapore, Singapore
  • Volume
    28
  • Issue
    4
  • fYear
    2009
  • Firstpage
    58
  • Lastpage
    66
  • Abstract
    We develop a novel method to identify patients with different disease risk levels. Our method estimates the optimal partition (cutoff) of a single gene's expression level by maximizing the separation of the survival curves associated with high- and low-risk disease behavior. We extend our approach to construct two-gene signatures, which can exhibit a synergetic influence on patient survival. Using bootstrapping and statistical modeling, we evaluate the performance of our method by analyzing Affymetrix U133 data sets of two large breast cancer patient cohorts. Using 232 grade-signature genes associated with different aggressiveness of breast tumors, we reveal a large number of gene pairs that have a pronounced synergetic effect on patient survival time and identify patients with low- and high-risk disease subtypes. The selected survival-significant genes are strongly supported by gene ontology (GO) analysis and literature data. Specifically, we demonstrate for the first time that the cyclin A2 (or cyclin A) and protein tyrosine phosphatase T (CCNA2-PTPRT) and the megalin (LRP2)-integrin alpha-7 (ITGA7) gene pairs can provide strong, clinically significant interaction effects on the survival of breast cancer patients. Our technique has the potential to be a powerful tool for classification, prediction, and prognosis of cancer and other complex diseases.
  • Keywords
    bootstrapping; cancer; genetics; medical computing; patient diagnosis; risk analysis; Affymetrix U133 data sets; breast cancer; cancer patient survival prediction; data driven approach; disease risk level; gene expression level; gene ontology; statistical modeling; Bioinformatics; Breast tumors; Data analysis; Diseases; Gene expression; Hazards; Medical treatment; Neoplasms; Performance analysis; Algorithms; Breast Neoplasms; Cell Cycle; Cohort Studies; Databases, Genetic; Female; Gene Expression Profiling; Humans; Kaplan-Meier Estimate; Models, Genetic; Oligonucleotide Array Sequence Analysis; Prognosis; Proportional Hazards Models; Regression Analysis; Reproducibility of Results
  • fLanguage
    English
  • Journal_Title
    Engineering in Medicine and Biology Magazine, IEEE
  • Publisher
    IEEE
  • ISSN
    0739-5175
  • Type
    jour
  • DOI
    10.1109/MEMB.2009.932937
  • Filename
    5165226
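
The single-gene step described in the abstract (choosing the expression cutoff that maximizes the separation between the survival curves of the two resulting patient groups) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the two-group log-rank chi-square statistic as the measure of separation, which the abstract does not specify, and the function names (`logrank_chi2`, `best_cutoff`), the `min_group` parameter, and the toy data are all hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def logrank_chi2(times: Sequence[float], events: Sequence[int],
                 in_high: Sequence[bool]) -> float:
    """Two-group log-rank chi-square statistic (an assumed separation measure)."""
    observed_high = 0.0
    expected_high = 0.0
    variance = 0.0
    # Visit each distinct time at which at least one event (death) occurred.
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n_high = sum(1 for i in at_risk if in_high[i])
        died = [i for i in at_risk if times[i] == t and events[i]]
        d = len(died)
        observed_high += sum(1 for i in died if in_high[i])
        expected_high += d * n_high / n
        if n > 1:  # hypergeometric variance of the high-group event count
            p = n_high / n
            variance += d * p * (1 - p) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0
    return (observed_high - expected_high) ** 2 / variance


def best_cutoff(expression: Sequence[float], times: Sequence[float],
                events: Sequence[int],
                min_group: int = 2) -> Tuple[Optional[float], float]:
    """Scan observed expression values and keep the cutoff with the largest statistic."""
    best_cut: Optional[float] = None
    best_stat = -1.0
    for cut in sorted(set(expression)):
        in_high: List[bool] = [x > cut for x in expression]
        n_high = sum(in_high)
        # Skip cutoffs that leave either group with too few patients.
        if n_high < min_group or len(in_high) - n_high < min_group:
            continue
        stat = logrank_chi2(times, events, in_high)
        if stat > best_stat:
            best_cut, best_stat = cut, stat
    return best_cut, best_stat


# Toy data (hypothetical): one gene, eight patients, all events observed.
expression = [1, 2, 3, 4, 5, 6, 7, 8]
times = [10, 9, 8, 7, 2, 3, 1, 4]
events = [1, 1, 1, 1, 1, 1, 1, 1]
cut, stat = best_cutoff(expression, times, events)
print(cut, round(stat, 3))
```

Because the cutoff is selected to maximize the statistic, its nominal p-value is optimistic; the abstract's mention of bootstrapping suggests a resampling step for assessing the selected cutoff, which this sketch omits.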