
Abstract

Diabetes is a chronic disease of rising prevalence that poses a growing burden, especially in low- and middle-income countries. Egypt ranks ninth in the world for diabetes, with an estimated adult prevalence of 15.2%, which makes early detection urgent in order to limit complications such as retinopathy, renal impairment, and limb amputation. This study addresses the classification of Type 2 diabetes (T2DM) by applying five machine learning algorithms: support vector machine (SVM), naïve Bayes (NB), K-nearest neighbor (KNN), Bayesian network (BNC), and stochastic gradient descent (SGD), together with the CHAID algorithm, which is used to produce a conditional segmentation variable that models non-linear interactions and improves the expressivity of the features. The CHAID analysis found that the strongest predictors of T2DM were high levels of hemoglobin A1c and insulin resistance, followed by triglycerides and then by age, obesity, and blood pressure. Effects of the metabolic, cardiovascular, and lifestyle variables were small to moderate, with a significant amount of clustering. The hybrid model was designed to guard against overfitting and so to support robust, generalizable classification performance. The proposed hybrid models outperformed the corresponding single models. Specifically, SVM with CHAID and SGD with CHAID reached a classification accuracy of 100%, indicating the approach's potential as a tool for early detection and risk assessment of diabetes.
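
The hybrid idea described in the abstract (derive a segmentation variable from a tree, then pass it to a classifier alongside the original features) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: scikit-learn does not provide CHAID, so a shallow CART tree (`DecisionTreeClassifier`) stands in for it; the data are synthetic, generated by `make_classification`, not the study's diabetes data; and the helper `add_segment_feature` and all parameter values are hypothetical.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def add_segment_feature(tree, X):
    """Append the id of the tree leaf each row falls into as an extra column."""
    leaf_ids = tree.apply(X)  # one leaf index per sample
    return np.column_stack([X, leaf_ids])


# Synthetic stand-in data (assumption: NOT the study's dataset).
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Stand-in for CHAID: a shallow CART tree fitted on the training split only,
# so the segments are learned without looking at the test labels.
segmenter = DecisionTreeClassifier(max_depth=3, min_samples_leaf=20, random_state=0)
segmenter.fit(X_train, y_train)

X_train_h = add_segment_feature(segmenter, X_train)
X_test_h = add_segment_feature(segmenter, X_test)

n_original = X.shape[1]


def make_hybrid(classifier):
    """Scale the original features, one-hot encode the segment id, then classify."""
    preprocess = ColumnTransformer(
        [
            ("num", StandardScaler(), list(range(n_original))),
            ("seg", OneHotEncoder(handle_unknown="ignore"), [n_original]),
        ]
    )
    return make_pipeline(preprocess, classifier)


models = {
    "SVM": make_hybrid(SVC(kernel="rbf")),
    "SGD": make_hybrid(SGDClassifier(random_state=0)),
}

# Held-out accuracy of each hybrid model on the synthetic data.
scores = {}
for name, model in models.items():
    model.fit(X_train_h, y_train)
    scores[name] = model.score(X_test_h, y_test)
    print(f"{name} + segment feature: accuracy = {scores[name]:.3f}")
```

Fitting the segmenting tree on the training split alone keeps the test set untouched, which matters when judging whether a reported accuracy generalizes. Accuracies from this sketch say nothing about the study's results.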

Keywords

CHAID Algorithm; Diabetes; Support Vector Machine; Naïve Bayes; K-Nearest Neighbor; Bayesian Network; Stochastic Gradient Descent

Article Details

How to Cite
Shallan, W. M. E. M. (2025). Hybrid Approach Based on the CHAID Algorithm for Improving Classification Performance of Diabetes Data. Pakistan Journal of Statistics and Operation Research, 21(4), 531-544. https://doi.org/10.18187/pjsor.v21i4.4771
