Title of article :
Predicting preeclampsia and related risk factors using data mining approaches: A cross-sectional study
Author/Authors :
Manoochehri, Zohreh, Department of Biostatistics - Student Research Committee - Hamadan University of Medical Sciences; Manoochehri, Sara, Department of Biostatistics - Student Research Committee - Hamadan University of Medical Sciences; Soltani, Farzaneh, Department of Midwifery - School of Nursing and Midwifery - Hamadan University of Medical Sciences; Tapak, Leili, Department of Biostatistics - Modeling of Noncommunicable Disease Research Center, School of Public Health - Hamadan University of Medical Sciences; Sadeghifar, Majid, Department of Statistics - Faculty of Basic Sciences - Bu-Ali Sina University
From page :
959
To page :
968
Abstract :
Background: Preeclampsia is a hypertensive disorder of pregnancy that has adverse effects on both the mother and the fetus. Despite recent advances in understanding the etiology of preeclampsia, no adequate clinical screening tests have been identified to diagnose the disorder. Objective: We aimed to provide a model based on data mining approaches that can be used as a screening tool to identify patients with this syndrome and to identify the risk factors associated with it. Materials and Methods: The data for this cross-sectional study were extracted from the clinical records of 726 mothers with preeclampsia and 726 mothers without preeclampsia who were referred to Fatemieh Hospital in Hamadan City between April 2005 and March 2015. Six data mining methods were applied: logistic regression, k-nearest neighbors, C5.0 decision tree, discriminant analysis, random forest, and support vector machine. Their performance was compared using accuracy, sensitivity, and specificity. Results: Underlying condition, age, pregnancy season, and the number of pregnancies were the most important risk factors for diagnosing preeclampsia. The accuracies of the models were as follows: logistic regression (0.713), k-nearest neighbors (0.742), C5.0 decision tree (0.788), discriminant analysis (0.687), random forest (0.758), and support vector machine (0.791). Conclusion: Among the data mining methods employed in this study, support vector machine was the most accurate in predicting preeclampsia. Therefore, this model can be considered as a screening tool to diagnose this disorder.
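The three comparison criteria named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 marks a preeclampsia case and 0 a control. The function name `screening_metrics` and the toy labels are hypothetical and are not taken from the study's data or code.

```python
import math


def screening_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Assumption: 1 = preeclampsia (positive), 0 = no preeclampsia (negative).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed

    def ratio(num, den):
        # Return NaN instead of raising when a class is absent.
        return num / den if den else math.nan

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Toy labels for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

For a screening tool, sensitivity is often weighed alongside accuracy, since a missed case (false negative) usually costs more than a false alarm.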
Keywords :
Preeclampsia, Random forest, C5.0 decision tree, Support vector machine, Logistic regression
Journal title :
International Journal of Reproductive BioMedicine
Record number :
2711669