Table 3 Comparison of imbalanced data handling techniques using accuracy and area under the curve (AUC)

From: Machine learning to predict virological failure among HIV patients on antiretroviral therapy in the University of Gondar Comprehensive and Specialized Hospital, in Amhara Region, Ethiopia, 2022

| Algorithms | Comparison method | Unbalanced | SMOTE | Under-Sampling | ADASYN |
|---|---|---|---|---|---|
| Logistic Regression | Accuracy (%) | 97.23 | 95.11 | 82.94 | 94.40 |
| | AUC | 0.915 | 0.987 | 0.913 | 0.986 |
| K Nearest Neighbours | Accuracy (%) | 97.34 | 95.54 | 81.38 | 93.20 |
| | AUC | 0.705 | 0.986 | 0.881 | 0.973 |
| Decision Tree | Accuracy (%) | 95.82 | 98.08 | 81.83 | 93.34 |
| | AUC | 0.619 | 0.982 | 0.817 | 0.935 |
| Random Forest | Accuracy (%) | 97.42 | 98.80 | 83.34 | 95.10 |
| | AUC | 0.892 | 0.999 | 0.917 | 0.991 |
| Gradient Boosting | Accuracy (%) | 97.17 | 95.22 | 84.52 | 93.06 |
| | AUC | 0.903 | 0.988 | 0.912 | 0.982 |
| XGBoost | Accuracy (%) | 96.98 | 95.25 | 82.58 | 96.24 |
| | AUC | 0.870 | 0.997 | 0.901 | 0.994 |
| Support Vector Machine | Accuracy (%) | 97.32 | 96.99 | 84.60 | 95.19 |
| | AUC | 0.867 | 0.995 | 0.940 | 0.991 |

AUC: Area Under the Curve; SMOTE: Synthetic Minority Over-sampling Technique; ADASYN: Adaptive Synthetic Sampling. Underlined and bold numbers indicate the highest score for each classifier.
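The table reports both accuracy and AUC because the two can diverge on imbalanced data. For example, the Decision Tree on the unbalanced data reaches 95.82% accuracy with an AUC of only 0.619. The following is a minimal sketch of that effect in plain Python. It is not the study's code: the toy labels, the scores, and the helper names `accuracy` and `roc_auc` are hypothetical and chosen only for illustration. AUC is computed here as the fraction of positive/negative pairs ranked correctly, with ties counted as half.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample is scored
    above a random negative sample (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study): 95 negatives, 5 positives.
y_true = [0] * 95 + [1] * 5

# A degenerate model that gives every sample the same low score,
# so it always predicts the majority (negative) class at threshold 0.5.
scores_constant = [0.1] * 100
y_pred_constant = [1 if s >= 0.5 else 0 for s in scores_constant]

acc_constant = accuracy(y_true, y_pred_constant)
auc_constant = roc_auc(y_true, scores_constant)

print(f"accuracy = {acc_constant:.2f}, AUC = {auc_constant:.2f}")
```

In this toy case the model never identifies a positive sample, yet its accuracy is high because negatives dominate, while its AUC stays at chance level. This is one reason to read the AUC rows alongside the accuracy rows when comparing the unbalanced column with the resampled ones.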