Title :
Patient specific predictions in the intensive care unit using a Bayesian ensemble
Author :
Johnson, Alistair E. W. ; Dunkley, N. ; Mayaud, Louis ; Tsanas, Athanasios ; Kramer, A.A. ; Clifford, G.D.
Author_Institution :
Univ. of Oxford, Oxford, UK
Abstract :
An intensive care unit mortality prediction model for the PhysioNet/Computing in Cardiology Challenge 2012 using a novel Bayesian ensemble learning algorithm is described. Methods: Data pre-processing was performed automatically, based upon domain knowledge, to remove artefacts and erroneous recordings, e.g. physiologically invalid entries and unit conversion errors. A range of diverse features was extracted from the original time series signals, including standard statistical descriptors such as the minimum, maximum, median, first, last, and the number of values. A new Bayesian ensemble scheme comprising 500 weak learners was then developed to classify the data samples. Each weak learner was a decision tree of depth two, which randomly assigned an intercept and gradient to a randomly selected single feature. The parameters of the ensemble learner were determined using a custom Markov chain Monte Carlo sampler. Results: The model was trained using 4000 observations from the training set, and was evaluated by the organisers of the competition on two new data sets with 4000 observations each (set b and set c). The outcomes of these data sets were unavailable to the competitors. The competition was judged on two events by two scores. Score 1 was the minimum of the positive predictive value and sensitivity for binary model predictions, and the model achieved 0.5310 and 0.5353 on the unseen data sets. Score 2, a range-normalized Hosmer-Lemeshow C statistic, evaluated to 26.44 and 29.86. The model was re-developed using the updated data sets from phase 2 after the competition, and achieved a Score 1 of 0.5374 and a Score 2 of 18.20 on set c. Conclusion: The proposed prediction model performs favourably on both the provided and hidden data sets (set a and set b), and has the potential to be used effectively for patient-specific predictions.
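As an illustration only, the per-signal descriptors and Score 1 named in the abstract can be sketched in Python. This is not the authors' code: the function names `summarise_series` and `score1`, the 0/1 label encoding, and the choice to return 0.0 when a ratio has a zero denominator are all assumptions made for this sketch.

```python
from statistics import median


def summarise_series(values):
    """Return the six descriptors listed in the abstract for one time series.

    `values` is assumed to be a non-empty list of numbers in time order.
    """
    if not values:
        raise ValueError("empty series")
    return {
        "min": min(values),
        "max": max(values),
        "median": median(values),
        "first": values[0],
        "last": values[-1],
        "count": len(values),
    }


def score1(y_true, y_pred):
    """Minimum of positive predictive value and sensitivity.

    Labels are assumed to be 0 (survived) or 1 (died). Returning 0.0 for an
    undefined ratio is a convention chosen here, not taken from the source.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return min(ppv, sensitivity)
```

For example, `summarise_series([80, 95, 70, 88])` gives the six descriptors for a short heart-rate trace, and `score1(labels, predictions)` returns a single value between 0 and 1. Score 2 is not sketched because the abstract does not define its range normalisation.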
Keywords :
Markov processes; Monte Carlo methods; belief networks; feature extraction; learning (artificial intelligence); medical computing; patient care; time series; Bayesian ensemble learning algorithm; binary model predictions; custom Markov chain Monte Carlo sampler; data preprocessing; data sets; decision tree; diverse feature extraction; domain knowledge; intensive care unit mortality prediction model; patient specific predictions; range-normalized Hosmer-Lemeshow C statistics; standard statistical descriptors; time series signals; Biomedical monitoring; Data models; Feature extraction; Predictive models; Training; Vegetation;
Conference_Title :
Computing in Cardiology (CinC), 2012
Conference_Location :
Krakow
Print_ISBN :
978-1-4673-2076-4