Abstract
Diabetes is a chronic disease of rising prevalence and a growing burden, especially in low- and middle-income countries. Egypt ranks ninth in the world for diabetes prevalence, with an estimated 15.2% of adults affected, which makes early detection urgent in order to limit complications such as retinopathy, renal impairment and limb amputation. This study proposes a method for classifying Type 2 diabetes mellitus (T2DM) that applies five machine learning algorithms: support vector machine (SVM), naïve Bayes (NB), K-nearest neighbor (KNN), Bayesian network (BNC) and stochastic gradient descent (SGD). Each is combined with the CHAID algorithm, which produces a conditional segmentation variable that models non-linear interactions and makes the feature set more expressive. The CHAID analysis found that the strongest predictors of T2DM were high hemoglobin A1c and insulin resistance, followed by triglycerides and then by age, obesity and blood pressure. Effects of the metabolic, cardiovascular and lifestyle variables were small to moderate and showed substantial clustering. The hybrid design was intended to guard against overfitting and so to give robust, generalizable classification performance. The proposed hybrid models outperformed the corresponding single models. Specifically, SVM with CHAID and SGD with CHAID each achieved a classification accuracy of 100%, indicating the approach's potential as a tool for early detection and risk assessment of diabetes.
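The hybrid idea described in the abstract, a segmentation variable derived from a tree and fed to a classifier alongside the original features, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic (not the study's dataset), scikit-learn has no CHAID implementation so a shallow `DecisionTreeClassifier` stands in for it, and the helper name `add_segment_feature` and all hyperparameters are hypothetical. It is not the authors' pipeline and will not reproduce their reported results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic placeholder data (assumption): NOT the study's clinical dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stand-in for CHAID (assumption): a shallow decision tree whose leaves
# play the role of segments. CHAID itself uses chi-square-based splits.
segmenter = DecisionTreeClassifier(
    max_depth=3, min_samples_leaf=20, random_state=0
)
segmenter.fit(X_train, y_train)

# One-hot encode the leaf (segment) id so it can be used as a feature.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(segmenter.apply(X_train).reshape(-1, 1))


def add_segment_feature(features):
    """Append the one-hot segment membership to the original features."""
    leaves = segmenter.apply(features).reshape(-1, 1)
    return np.hstack([features, encoder.transform(leaves).toarray()])


X_train_h = add_segment_feature(X_train)
X_test_h = add_segment_feature(X_test)

# Two of the five classifiers named in the abstract, trained on the
# augmented features. The segmenter is fit on training data only, so the
# held-out score is not contaminated by test labels.
models = {
    "SVM + segments": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "SGD + segments": make_pipeline(
        StandardScaler(), SGDClassifier(random_state=0)
    ),
}
scores = {}
for name, model in models.items():
    model.fit(X_train_h, y_train)
    scores[name] = model.score(X_test_h, y_test)
    print(f"{name}: held-out accuracy = {scores[name]:.3f}")
```

Fitting the segmenter only on the training split matters here: deriving segments from the full dataset would leak label information into the held-out evaluation and could inflate the reported accuracy.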
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following license:
CC BY: This license allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.