• DocumentCode
    2155324
  • Title

    Caffeine Intake, Race, and Risk of Invasive Breast Cancer: Lessons Learned from Data Mining a Clinical Database

  • Author

    Maskery, Susan ; Zhang, Yonghong ; Hu, Hai ; Shriver, Craig ; Hooke, Jeffrey ; Liebman, Michael

  • Author_Institution
    Windber Res. Inst., PA
  • fYear
    0
  • fDate
    0-0 0
  • Firstpage
    714
  • Lastpage
    718
  • Abstract
    Over the past five years, the Clinical Breast Care Project (CBCP) has amassed a significant patient database and tissue repository related to breast disease and breast cancer. We have begun mining this unique data source (i.e., life history questionnaire data, pathology reports, and analysis of blood and tissue samples) to examine interactions between known risk factors for breast cancer development (e.g., menopausal status and parity) and breast disease and cancer incidence in our patient population. From these initial forays into analyzing the CBCP's data repository, we have begun to develop protocols for data mining. In particular, a crucial first step is to quantify interactions between variables of interest prior to any specific significance tests relating individual variables to risk of a clinical result. For this purpose, we find Bayesian network analysis the most useful method for exploring data interactions. To illustrate this point, this abstract details an investigation into the effect of caffeine consumption on breast cancer incidence in our CBCP population. Based on our experience with this and other studies, we strongly recommend Bayesian network analysis of all variables of interest as an initial data exploration tool.
  • Keywords
    Bayes methods; biological organs; blood; cancer; data mining; mammography; medical information systems; Bayesian network analysis; Clinical Breast Care Project; blood analysis; caffeine intake; clinical database; data exploration; data mining; data repository; invasive breast cancer; life history questionnaire data; menopausal status; parity; pathology reports; patient database; race; tissue repository; tissue sample analysis; Bayesian methods; Blood; Breast cancer; Data analysis; Data mining; Databases; Diseases; History; Pathology; Risk analysis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer-Based Medical Systems, 2006. CBMS 2006. 19th IEEE International Symposium on
  • Conference_Location
    Salt Lake City, UT
  • ISSN
    1063-7125
  • Print_ISBN
    0-7695-2517-1
  • Type

    conf

  • DOI
    10.1109/CBMS.2006.64
  • Filename
    1647655