Title :
Classification of pathology data using a probabilistic (Bayesian) model
Author :
Robin, Howard ; Eberhardt, John S., III ; Muller, Wayne D. ; Clark, Robert ; Kam, Jenny
Author_Institution :
Sharp Memorial Hosp., San Diego, CA, USA
Abstract :
Applying dynamic statistical modeling to clinical laboratory data can help the clinician make more effective use of this essential information. We applied a probabilistic (Bayesian) classifier to a set of 645 breast tumor samples for which we had extensive clinical and pathological information. The samples were formalin-fixed, paraffin-embedded breast cancer specimens observed from 1999 to 2003 and originally submitted to the Sharp Hospital system (San Diego, CA) for pathologic evaluation and breast predictive and prognostic studies. The data were analyzed with a Bayesian network using DecisionQ Faster Analytics. The model eliminated age and biopsy site as predictive markers and identified relationships between Ki67 proliferative index, tumor size, and Combined Nottingham Histologic Grade (CNHG), as well as between CNHG and ploidy, differentiation, tumor type, and estrogen-progesterone status. We performed cross-validation analysis to statistically validate the structure of the model. We calculated ROC curves; the area under the curve ranged from 68.5% to 91.2% with a mean of 80.1%, and the positive predictive value of the model ranged from 55.6% to 82.6% with a mean of 69.2%. Our Bayesian model supports decision-making in complex disease populations by illustrating key relationships and providing probability estimates. The structure of the network allows it to support different types of clinical decisions by a variety of medical specialists. The Bayesian analysis tool we have developed assists the clinician and researcher in using prior knowledge, outcomes experience, and pathology data to make more meaningful diagnostic and treatment decisions in an evidence-based, validated framework.
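The two validation metrics quoted in the abstract, area under the ROC curve and positive predictive value, can be illustrated with a minimal sketch. It is not the authors' pipeline or the DecisionQ software: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made up for illustration, and none of the numbers come from the study.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def positive_predictive_value(y_true, y_pred):
    """PPV = true positives / (true positives + false positives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        raise ValueError("no positive predictions, PPV is undefined")
    return tp / (tp + fp)


# Hypothetical toy data: 1 = outcome present, 0 = outcome absent.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
# Hypothetical posterior probabilities from some probabilistic classifier.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.5]
# Assumed decision threshold of 0.5 for turning probabilities into calls.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc = roc_auc(y_true, scores)
ppv = positive_predictive_value(y_true, y_pred)
print(f"AUC = {auc:.4f}, PPV = {ppv:.4f}")
```

In a cross-validation setting, such metrics would typically be computed on each held-out fold and then summarized as a range and a mean, which is the form in which the abstract reports them.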
Keywords :
Bayes methods; belief networks; cancer; classification; data analysis; medical diagnostic computing; medical information systems; probability; tumours; Bayesian network; Combined Nottingham Histologic Grade; DecisionQ Faster Analytics; Ki67 proliferative index; biopsy; breast predictive study; breast prognostic study; breast tumor samples; clinical decisions; clinical laboratory data; cross-validation analysis; data analysis; decision making; estrogen-progesterone status; paraffin embedded breast cancer samples; pathological information; pathology data classification; probabilistic Bayesian model; probability estimates; statistical modeling; tumor size; Bayesian methods; Biopsy; Breast cancer; Breast neoplasms; Breast tumors; Data analysis; Hospitals; Laboratories; Pathology; Predictive models;
Conference_Title :
Systems Engineering, 2005. ICSEng 2005. 18th International Conference on
Print_ISBN :
0-7695-2359-5
DOI :
10.1109/ICSENG.2005.22