Diabetes Detection via Machine Learning: Tackling Challenges and Envisioning Future Innovations

Document Type: Original Article

Authors

1 Department of Mathematics and Computer Sciences, Faculty of Sciences, Port Said

2 Port Said University

3 Faculty of Computers and Information, Damietta University, Egypt

Abstract

Diabetes is one of the most serious diseases, affecting millions of individuals worldwide. Scientists are working to reduce its prevalence and incidence, and extensive research has sought to pinpoint the most effective techniques for predicting it. Previously used approaches include data mining (DM), deep learning (DL), and machine learning (ML), which researchers employ to forecast diabetes at an early stage and mitigate its impact. Many ML algorithms have been applied to this task, such as Support Vector Machine (SVM), Ordering Points To Identify the Clustering Structure (OPTICS), Random Forest (RF), Gradient Boosting (GB), Decision Tree (DT), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), XGBoost, and Logistic Regression (LR). While some studies confirm the effectiveness of these methods, recent findings underscore the superior efficiency of neural networks and deep learning. In this paper, we compare eight ML algorithms (including an enhanced deep learning model) for diabetes prediction on three datasets, evaluating each with confusion matrix analysis and accuracy. Our findings indicate that the enhanced deep learning model achieves 84%, 93%, and 100% on the PID, Taipei, and German datasets, respectively, outperforming all other machine learning algorithms evaluated in this paper.
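The evaluation described above rests on two quantities: the confusion matrix and the accuracy derived from it. The following is a minimal sketch of how they can be computed and used to rank models. It is not the authors' code: the labels, the model names `model_a` and `model_b`, and the helper functions are hypothetical and chosen only for illustration.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels (1 = diabetic, 0 = non-diabetic)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count each (actual, predicted) pair; missing pairs count as zero.
    counts = Counter(zip(y_true, y_pred))
    tn = counts[(0, 0)]
    fp = counts[(0, 1)]
    fn = counts[(1, 0)]
    tp = counts[(1, 1)]
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Fraction of correct predictions: (TP + TN) / all samples."""
    total = tn + fp + fn + tp
    return (tp + tn) / total if total else 0.0


# Hypothetical toy data, not taken from the PID, Taipei, or German datasets.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
preds = {
    "model_a": [1, 0, 1, 0, 0, 0, 1, 1],
    "model_b": [1, 0, 1, 1, 0, 1, 1, 0],
}

# Score every model and pick the one with the highest accuracy.
results = {name: accuracy(*confusion_matrix(y_true, p)) for name, p in preds.items()}
best = max(results, key=results.get)

for name, acc in results.items():
    print(f"{name}: accuracy = {acc:.3f}")
print("best:", best)
```

Accuracy alone can hide an imbalance between false positives and false negatives, which is why the confusion matrix is worth reporting alongside it.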

Keywords

Main Subjects