TY - JOUR
T1 - Predictive Modeling for the Diagnosis of Gestational Diabetes Mellitus Using Epidemiological Data in the United Arab Emirates
AU - Ali, Nasloon
AU - Khan, Wasif
AU - Ahmad, Amir
AU - Masud, Mohammad Mehedy
AU - Adam, Hiba
AU - Ahmed, Luai A.
N1 - © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/)
PY - 2022/10
Y1 - 2022/10
N2 - Gestational diabetes mellitus (GDM) is a common condition with repercussions for both the mother and her child. Machine learning (ML) modeling techniques were proposed to predict the risk of several medical outcomes. A systematic evaluation of the predictive capacity of maternal factors resulting in GDM in the UAE is warranted. Data on a total of 3858 women who gave birth and had information on their GDM status in a birth cohort were used to fit the GDM risk prediction model. Information used for the predictive modeling were from self-reported epidemiological data collected at early gestation. Three different ML models, random forest (RF), gradient boosting model (GBM), and extreme gradient boosting (XGBoost), were used to predict GDM. Furthermore, to provide local interpretation of each feature in GDM diagnosis, features were studied using Shapley additive explanations (SHAP). Results obtained using ML models show that XGBoost, which achieved an AUC of 0.77, performed better compared to RF and GBM. Individual feature importance using SHAP value and the XGBoost model show that previous GDM diagnosis, maternal age, body mass index, and gravidity play a vital role in GDM diagnosis. ML models using self-reported epidemiological data are useful and feasible in prediction models for GDM diagnosis amongst pregnant women. Such data should be periodically collected at early pregnancy for health professionals to intervene at earlier stages to prevent adverse outcomes in pregnancy and delivery. The XGBoost algorithm was the optimal model for identifying the features that predict GDM diagnosis.
AB - Gestational diabetes mellitus (GDM) is a common condition with repercussions for both the mother and her child. Machine learning (ML) modeling techniques were proposed to predict the risk of several medical outcomes. A systematic evaluation of the predictive capacity of maternal factors resulting in GDM in the UAE is warranted. Data on a total of 3858 women who gave birth and had information on their GDM status in a birth cohort were used to fit the GDM risk prediction model. Information used for the predictive modeling were from self-reported epidemiological data collected at early gestation. Three different ML models, random forest (RF), gradient boosting model (GBM), and extreme gradient boosting (XGBoost), were used to predict GDM. Furthermore, to provide local interpretation of each feature in GDM diagnosis, features were studied using Shapley additive explanations (SHAP). Results obtained using ML models show that XGBoost, which achieved an AUC of 0.77, performed better compared to RF and GBM. Individual feature importance using SHAP value and the XGBoost model show that previous GDM diagnosis, maternal age, body mass index, and gravidity play a vital role in GDM diagnosis. ML models using self-reported epidemiological data are useful and feasible in prediction models for GDM diagnosis amongst pregnant women. Such data should be periodically collected at early pregnancy for health professionals to intervene at earlier stages to prevent adverse outcomes in pregnancy and delivery. The XGBoost algorithm was the optimal model for identifying the features that predict GDM diagnosis.
KW - gestational diabetes mellitus
KW - machine learning
KW - prediction modeling
KW - United Arab Emirates
UR - http://www.scopus.com/inward/record.url?scp=85140629637&partnerID=8YFLogxK
U2 - 10.3390/info13100485
DO - 10.3390/info13100485
M3 - Article
VL - 13
JO - Information
JF - Information
IS - 10
ER -