Title :
Distributed Privacy Preserving Decision Support System for Predicting Hospitalization Risk in Hospitals with Insufficient Data
Author :
Mathew, George ; Obradovic, Z.
Author_Institution :
Center for Data Analytics & Biomed. Inf., Temple Univ., Philadelphia, PA, USA
Abstract :
Dynamically building prediction models that draw suggestive knowledge from multiple sources is of great interest for clinical decision support. This is valuable when the local clinical data repository does not hold a sufficient number of records to draw conclusions from. However, due to privacy concerns, hospitals are reluctant to divulge patient records. Consequently, a distributed model building mechanism that can use just the statistics from multiple hospitals' databases is valuable. Our DIDT algorithm builds a model in that fashion. In this study, using National Inpatient Sample (NIS) data for 2009, we demonstrate that the DIDT algorithm can help hospitals collaboratively build a better decision-making model when each has too few records to build a good local model. Based on 262 attributes used for model building, we showed that 9 collaborating hospitals, each with fewer than 100 diabetes-related hospitalization cases, collectively achieved a 9.9% improvement in hospitalization prediction accuracy using a distributed model, compared with relying on local models developed on their own. When relying on local risk prediction models for diabetes at these 9 hospitals, 159 of 357 patients were misclassified and prediction was impossible for another 16 patients. Our integrated model reduced the number of misclassified patients to 138, effectively providing accurate early diagnostics to 37 additional patients (159 + 16 = 175 patients without a correct local prediction, against 138 with the integrated model). We also introduce the concept of banding, which improves the DIDT algorithm by logically combining multiple hospitals when a large number of hospitals is involved, thereby reducing the number of cross-validation folds.
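The abstract does not describe how DIDT works internally, so the following is only a generic sketch of the idea it states: each hospital shares summary statistics rather than patient records. It assumes a decision-tree-style split scored by information gain, which may differ from the authors' method. The function names (`local_counts`, `pooled_information_gain`), the `insulin` attribute, and the toy records are all hypothetical and are not taken from the paper or from NIS.

```python
from collections import Counter
from math import log2


def local_counts(records, attribute, label="hospitalized"):
    """Run at each hospital: tally (attribute value, class label) pairs.

    Only this table of counts would be sent out, not the records themselves.
    """
    return Counter((r[attribute], r[label]) for r in records)


def entropy(class_counts):
    """Shannon entropy (in bits) of a Counter mapping class -> count."""
    total = sum(class_counts.values())
    return -sum((n / total) * log2(n / total) for n in class_counts.values() if n)


def pooled_information_gain(count_tables):
    """Run at a coordinator: sum per-hospital tables, then score the split."""
    pooled = sum(count_tables, Counter())
    by_class = Counter()
    by_value = {}
    for (value, cls), n in pooled.items():
        by_class[cls] += n
        by_value.setdefault(value, Counter())[cls] += n
    total = sum(by_class.values())
    remainder = sum(sum(c.values()) / total * entropy(c) for c in by_value.values())
    return entropy(by_class) - remainder


# Made-up toy records for two hypothetical hospitals (illustration only).
hospital_a = [{"insulin": "yes", "hospitalized": 1}, {"insulin": "no", "hospitalized": 0}]
hospital_b = [{"insulin": "yes", "hospitalized": 1}, {"insulin": "no", "hospitalized": 0}]

tables = [local_counts(h, "insulin") for h in (hospital_a, hospital_b)]
gain = pooled_information_gain(tables)
```

Because the counts are additive, the gain computed from the pooled tables matches the gain that would be computed if all records sat in a single database, which is why sharing statistics alone can be enough to choose a split in this kind of scheme.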
Keywords :
data privacy; decision support systems; medical information systems; risk analysis; DIDT algorithm; NIS; clinical data repository; clinical decision support; cross-validation folds; decision-making model; distributed model building mechanism; distributed privacy preserving decision system; hospitalization risk prediction; national inpatient sample; patient records; privacy concerns; Accuracy; Buildings; Data models; Diabetes; Distributed databases; Hospitals; Predictive models; distributed decision making; privacy preserving prediction model
Conference_Title :
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location :
Boca Raton, FL
Print_ISBN :
978-1-4673-4651-1
DOI :
10.1109/ICMLA.2012.180