Regional Information Center for Science and Technology - Derivation and validation of a Bayesian network to predict pretest probability of venous thromboembolism

Abstract:

Study objectives: Bayesian networks analyze interdependencies between predictive variables. A genetic algorithm can be constructed that maps the directions and magnitudes of these dependencies, producing a framework that estimates the probability of the value of any one variable according to the values of the other variables. We derive a Bayesian model designed to assess the conditional probability of venous thromboembolism (VTE) and test its diagnostic accuracy.

Methods: A genetic algorithm was used to construct a Bayesian network that was manipulated using commercial software (Netica). The fitness function was derived from analysis of the interrelationships of 26 clinical variables prospectively collected on 3,148 emergency department (ED) patients originating from 10 hospitals. All patients underwent standardized testing, including pulmonary vascular imaging. A positive endpoint (VTE+) was defined as anticoagulation or vena caval interruption within 90 days. Eleven percent of derivation subjects were VTE+. The best-fit model was selected from a population of 75 competing models that were updated over 50 generations of the genetic algorithm search process. The resultant model was tested in a validation population of 1,427 ED patients who were evaluated for VTE at 2 hospitals from 2001 to 2003. All 1,427 had the same 26 variables prospectively collected using the same explicit variable definitions that were used in the derivation set. The 90-day VTE+ rate was 8% in the validation population. For statistical analysis, the model output was normalized to a score from 0 to 100 points, and diagnostic accuracy was assessed by area under the receiver operating characteristic (ROC) curve.

Results: When tested in the derivation set, the Bayesian model produced an area under the ROC curve of 0.78 (95% confidence interval [CI] 0.75 to 0.80). A score less than 3 generated the lowest likelihood ratio negative (0.07). When the model was tested in the validation population, the area under the ROC curve was 0.83 (95% CI 0.79 to 0.87), indicating good overall diagnostic accuracy. In the validation population, 711 of 1,427 (50%; 95% CI 47% to 52%) patients had a score less than 3 (test negative), which yielded a sensitivity of 90.3%, a specificity of 54.0%, and a pretest probability of 11 of 711 (1.5%; 95% CI 0 to 3.0%).

Conclusion: A Bayesian network derived from a large multicenter sample demonstrated good diagnostic performance in a large, 2-center validation population. Bayesian network analysis represents a promising methodology to provide clinicians with an accurate point estimate of pretest probability for VTE.
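
The validation figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis: it uses only numbers stated in the abstract, the function and variable names are hypothetical, and the 8% prevalence is the rounded value reported for the validation population. It computes the likelihood ratio negative implied by the reported sensitivity and specificity (a value the abstract does not itself report for the validation set), then applies Bayes' rule in odds form to see whether the result is consistent with the observed 11 of 711.

```python
def likelihood_ratio_negative(sensitivity: float, specificity: float) -> float:
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


def probability_after_negative(prevalence: float, lr_negative: float) -> float:
    """Convert prevalence to odds, multiply by LR-, convert back to probability."""
    prior_odds = prevalence / (1.0 - prevalence)
    posterior_odds = prior_odds * lr_negative
    return posterior_odds / (1.0 + posterior_odds)


# Figures quoted in the abstract for the validation population.
total_patients = 1427
test_negative = 711        # patients with a score less than 3
vte_in_negatives = 11      # VTE+ patients among the test negatives
sensitivity = 0.903
specificity = 0.540
prevalence = 0.08          # rounded 90-day VTE+ rate (assumption: exact value unknown)

fraction_negative = test_negative / total_patients
observed_rate = vte_in_negatives / test_negative
lr_negative = likelihood_ratio_negative(sensitivity, specificity)
predicted_rate = probability_after_negative(prevalence, lr_negative)

print(f"Fraction test negative:         {fraction_negative:.1%}")
print(f"Observed VTE rate if negative:  {observed_rate:.1%}")
print(f"Implied likelihood ratio neg.:  {lr_negative:.2f}")
print(f"Bayes-predicted rate if neg.:   {predicted_rate:.1%}")
```

Under these assumptions the Bayes-predicted rate and the observed 11 of 711 agree to within rounding, which suggests the reported sensitivity, specificity, and prevalence are mutually consistent.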