Retrospective Cohort Study
Copyright ©The Author(s) 2023.
World J Clin Cases. Nov 26, 2023; 11(33): 7951-7964
Published online Nov 26, 2023. doi: 10.12998/wjcc.v11.i33.7951
Table 1 Participant demographics

Variables | mean ± SD | N
Age | 67.38 ± 9.69 | 556
Body mass index | 26.16 ± 3.9 | 556
Duration of diabetes | 13.69 ± 7.94 | 556
Systolic blood pressure | 131.14 ± 15.42 | 493
Diastolic blood pressure | 73.32 ± 10.15 | 493
Hemoglobin | 12.92 ± 1.68 | 444
Triglyceride | 153.74 ± 45.85 | 539
Glycated hemoglobin | 7.79 ± 1.36 | 538
High density lipoprotein cholesterol | 122.65 ± 74.34 | 535
Low density lipoprotein cholesterol | 49.65 ± 14.75 | 498
Alanine aminotransferase | 23.87 ± 13.94 | 537
Creatinine | 1.16 ± 1 | 536
Microalbumin creatinine ratio | 194.18 ± 733.73 | 526
Homeostasis assessment-insulin resistance | 0.63 ± 0.34 | 366
Homeostasis assessment-insulin secretion | 1.71 ± 0.37 | 366
Table 2 Participant demographics – sex, smoking and sum stressed score

Variables | N (%) | N
Sex | | 556
    0 | 287 (51.62) |
    1 | 269 (48.38) |
Smoking | | 310
    0 | 202 (65.16) |
    1 | 108 (34.84) |
Sum stressed score | | 556
    0 | 180 (32.37) |
    1 | 376 (67.63) |
Table 3 Summary of the values of the hyperparameters for the best random forest, classification and regression tree, naïve Bayes classifier, and eXtreme gradient boosting models

Methods | Hyperparameters | Best value | Meaning
RF | Mtry | 8 | The number of random features used in each tree
RF | Ntree | 500 | The number of trees in the forest
CART | Minsplit | 20 | The minimum number of observations required to attempt a split at a node
CART | Minbucket | 7 | The minimum number of observations in a terminal node
CART | Maxdepth | 10 | The maximum depth of any node in the final tree
CART | Xval | 10 | The number of cross-validations
CART | Cp | 0.03588 | Complexity parameter: the minimum improvement required in the model at each node
XGBoost | Nrounds | 100 | The number of tree model iterations
XGBoost | Max_depth | 3 | The maximum depth of a tree
XGBoost | Eta | 0.4 | Shrinkage coefficient of the tree
XGBoost | Gamma | 0 | The minimum loss reduction required for a further split
XGBoost | Subsample | 0.75 | Subsample ratio of the training instances used when building each tree
XGBoost | Colsample_bytree | 0.8 | Subsample ratio of columns when constructing each tree
XGBoost | Rate_drop | 0.5 | Rate of trees dropped (DART booster)
XGBoost | Skip_drop | 0.05 | Probability of skipping the dropout procedure during a boosting iteration
XGBoost | Min_child_weight | 1 | The minimum sum of instance weights needed in a child node
NB | fL | 0 | Adjustment of the Laplace smoother
NB | Usekernel | TRUE | Use a kernel density estimate for continuous variables rather than a Gaussian density estimate
NB | Adjust | 1 | Adjusts the bandwidth of the kernel density
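The hyperparameter names in Table 3 follow the conventions of the R packages randomForest, rpart, xgboost and caret. The sketch below is illustrative only and is not code from the article; the data frame `train` and the binary outcome `dn` are assumed names, and the objective and data preparation are assumptions.

```r
# Illustrative R sketch only; `train` and the outcome `dn` are assumed names.
# Values are the "best" hyperparameters reported in Table 3.
library(randomForest)
library(rpart)
library(xgboost)
library(caret)

# Random forest: mtry = 8 candidate features per split, 500 trees
rf_fit <- randomForest(dn ~ ., data = train, mtry = 8, ntree = 500)

# CART: rpart.control carries minsplit, minbucket, maxdepth, xval and cp
cart_fit <- rpart(dn ~ ., data = train, method = "class",
                  control = rpart.control(minsplit = 20, minbucket = 7,
                                          maxdepth = 10, xval = 10, cp = 0.03588))

# XGBoost: rate_drop and skip_drop only apply to the DART booster
dtrain  <- xgb.DMatrix(data.matrix(train[, setdiff(names(train), "dn")]),
                       label = as.numeric(train$dn) - 1)
xgb_fit <- xgb.train(params = list(booster = "dart", objective = "binary:logistic",
                                   max_depth = 3, eta = 0.4, gamma = 0,
                                   subsample = 0.75, colsample_bytree = 0.8,
                                   rate_drop = 0.5, skip_drop = 0.05,
                                   min_child_weight = 1),
                     data = dtrain, nrounds = 100)

# Naïve Bayes via caret ("nb"): fL, usekernel and adjust form the tuning grid
nb_fit <- caret::train(dn ~ ., data = train, method = "nb",
                       tuneGrid = data.frame(fL = 0, usekernel = TRUE, adjust = 1))
```

The DART booster is assumed because rate_drop and skip_drop are dropout parameters that have no effect with the default tree booster.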
Table 4 The average performance of the logistic regression, random forest, naïve Bayes, classification and regression tree, and eXtreme gradient boosting methods

Method | Accuracy | Sensitivity | Specificity | AUC
LGR | 0.685 ± 0.072 | 0.687 ± 0.152 | 0.683 ± 0.114 | 0.703 ± 0.057
CART | 0.541 ± 0.074 | 0.546 ± 0.078 | 0.529 ± 0.670 | 0.540 ± 0.070
RF | 0.707 ± 0.047 | 0.711 ± 0.100 | 0.678 ± 0.099 | 0.707 ± 0.037
XGBoost | 0.712 ± 0.072 | 0.727 ± 0.139 | 0.674 ± 0.088 | 0.719 ± 0.062
NB | 0.692 ± 0.059 | 0.702 ± 0.116 | 0.669 ± 0.090 | 0.704 ± 0.056
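For reference, the hedged sketch below shows how the four columns of Table 4 are typically obtained in R for a single cross-validation fold; `truth` and `prob` are hypothetical vectors of observed classes and predicted probabilities, and the 0.5 classification cut-off is an assumption rather than a value reported in the article.

```r
# Hedged sketch: per-fold accuracy, sensitivity, specificity and AUC.
# `truth` (factor with levels "0"/"1") and `prob` are assumed objects.
library(caret)   # confusionMatrix()
library(pROC)    # roc(), auc()

pred <- factor(ifelse(prob > 0.5, 1, 0), levels = c(0, 1))
cm   <- caret::confusionMatrix(pred, truth, positive = "1")

metrics <- c(Accuracy    = unname(cm$overall["Accuracy"]),
             Sensitivity = unname(cm$byClass["Sensitivity"]),
             Specificity = unname(cm$byClass["Specificity"]),
             AUC         = as.numeric(pROC::auc(pROC::roc(truth, prob, quiet = TRUE))))

# Repeating this over the cross-validation folds and taking mean ± SD
# yields entries of the form reported in Table 4.
```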
Table 5 The variable importance and importance rank of the risk factors derived from the machine learning methods

Variables | RF | XGBoost | NB | Average | Rank
Sex | 100.0 ± 0 | 100.0 ± 0 | 100.0 ± 0 | 100.0 | 1.0
Body mass index | 54.2 ± 6.6 | 61.1 ± 14.7 | 86.2 ± 6.8 | 67.1 | 2.0
Age | 13.1 ± 7.6 | 78.3 ± 13.2 | 67.9 ± 6.5 | 53.1 | 3.0
Low density lipoprotein cholesterol | 30.4 ± 3.1 | 8.4 ± 12.8 | 71.0 ± 7.8 | 36.6 | 4.0
Glycated hemoglobin | 15.4 ± 5.9 | 12.8 ± 11.9 | 48.0 ± 8.3 | 25.4 | 5.0
Smoking | 12.2 ± 2.7 | 28.8 ± 9.2 | 34.5 ± 6.6 | 25.2 | 6.0
Creatinine | 10.1 ± 2.3 | 5.3 ± 9.12 | 53.1 ± 7.3 | 22.8 | 7.0
Duration of diabetes | 6.3 ± 4.61 | 41.5 ± 8.6 | 10.1 ± 8.9 | 19.3 | 8.0
Hemoglobin | 8.0 ± 4.16 | 16.6 ± 8.9 | 17.0 ± 5.7 | 13.8 | 9.0
Blood urea nitrogen | 9.0 ± 8.15 | 6.5 ± 6.79 | 17.3 ± 9.6 | 11.0 | 10.0
Systolic blood pressure | 4.2 ± 1.03 | 21.6 ± 5.1 | 6.4 ± 2.88 | 10.7 | 11.0
Triglyceride | 5.4 ± 17.5 | 15.0 ± 4.4 | 11.1 ± 12.3 | 10.5 | 12.0
Microalbumin | 4.3 ± 2.23 | 3.6 ± 3.83 | 22.7 ± 6.9 | 10.2 | 13.0
Diastolic blood pressure | 2.5 ± 5.91 | 18.9 ± 3.7 | 5.6 ± 9.33 | 9.0 | 14.0
Alanine aminotransferase | 3.2 ± 5.96 | 6.9 ± 3.90 | 13.0 ± 12.6 | 7.7 | 15.0
High density lipoprotein cholesterol | 1.3 ± 3.60 | 9.8 ± 3.29 | 7.3 ± 8.41 | 6.1 | 16.0
HOMA-IR | 5.7 ± 2.85 | 2.2 ± 2.52 | 10.2 ± 8.1 | 6.0 | 17.0
HOMA-B | 4.3 ± 2.22 | 0.0 ± 0.00 | 7.4 ± 8.83 | 3.9 | 18.0
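The 0-100 scaling, averaging and ranking in Table 5 are consistent with caret's scaled variable importance. Below is a minimal sketch under that assumption; rf_tr, xgb_tr and nb_tr are assumed caret::train fits of the three models (the names are illustrative and the article's exact procedure may differ).

```r
# Hedged sketch: caret::varImp() scales importances to 0-100 by default.
# rf_tr, xgb_tr and nb_tr are assumed caret::train objects, not the article's code.
library(caret)

imp_rf  <- varImp(rf_tr,  scale = TRUE)$importance[, 1]
imp_xgb <- varImp(xgb_tr, scale = TRUE)$importance[, 1]
imp_nb  <- varImp(nb_tr,  scale = TRUE)$importance[, 1]

imp <- data.frame(RF = imp_rf, XGBoost = imp_xgb, NB = imp_nb,
                  row.names = rownames(varImp(rf_tr)$importance))
imp$Average <- rowMeans(imp[, c("RF", "XGBoost", "NB")])   # "Average" column of Table 5
imp$Rank    <- rank(-imp$Average, ties.method = "first")   # 1 = most important
imp[order(imp$Rank), ]

# Table 5 reports each importance as mean ± SD over repeated runs; repeating the
# steps above per run and summarizing would reproduce that format.
```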